Compare commits

...

192 Commits

Author SHA1 Message Date
James Hush
1d3ae6b029 TTSSpeakFrame context demo 2025-10-30 16:13:26 +08:00
Mark Backman
7872fa2e88 Merge pull request #2934 from roshie548/add-cartesia-generation-config
feat: add generation_config support for Cartesia Sonic-3
2025-10-29 23:10:48 -04:00
Roshan
e86c546a1a Merge branch 'main' into add-cartesia-generation-config 2025-10-29 18:31:09 -07:00
Roshan
abf34bcccf address pr comments 2025-10-29 18:29:51 -07:00
Aleix Conchillo Flaqué
56eb633390 Merge pull request #2911 from pipecat-ai/aleix/daily-transport-improve-error-handling
DailyTransport: update start_dialout/start_recording return values
2025-10-29 16:28:10 -07:00
Aleix Conchillo Flaqué
6299b9db87 DailyTransport: trigger "on_error" if transcription fails to start/stop 2025-10-29 16:25:13 -07:00
Aleix Conchillo Flaqué
bcffa590a3 DailyTransport: update start_dialout/start_recording return values 2025-10-29 16:25:13 -07:00
kompfner
8b739aa444 Merge pull request #2889 from pipecat-ai/pk/openai-realtime-universal-llmcontext-2
Support new `LLMContext` pattern with `OpenAIRealtimeLLMService`
2025-10-29 16:54:37 -04:00
Paul Kompfner
8f15980c67 Get rid of unnecessary new task in example file 2025-10-29 16:23:50 -04:00
Paul Kompfner
89e9acf0e1 CHANGELOG and code comment tweaks 2025-10-29 16:21:04 -04:00
Paul Kompfner
ddac24e6c9 Fix a missing space in a warning message 2025-10-29 16:17:05 -04:00
Paul Kompfner
d0f52feba3 OpenAI Realtime needs the assistant context aggregator to have expect_stripped_words=False 2025-10-29 16:15:16 -04:00
Paul Kompfner
8894db4290 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add warning about no longer pushing `TTSTextFrame`s.
2025-10-29 15:45:06 -04:00
Paul Kompfner
1f96cdf970 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Make `LLMUserAggregator` push the `LLMSetToolsFrame`s, in case a speech-to-speech service that needs to handle the frame itself—like `OpenAIRealtimeLLMService`—is downstream. As far as I can tell, pushing `LLMSetToolsFrame` should otherwise have no unwanted side effects.
2025-10-29 15:43:51 -04:00
Paul Kompfner
0282033208 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add `LLMContext.get_messages_for_persistent_storage()` for compatibility with `OpenAILLMContext`, to avoid tripping up users who we're unknowingly migrating to using `LLMContext`.
2025-10-29 15:43:51 -04:00
Paul Kompfner
917ea27352 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update `AzureRealtimeLLMService` example (19a) to use new `LLMContext` pattern.
2025-10-29 15:43:51 -04:00
Paul Kompfner
8c03df1463 Update some docstring arg descriptions to be a bit more current or accurate 2025-10-29 15:43:51 -04:00
Paul Kompfner
15aa76efba Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Maintain backward compatibility with functions specified in dict format.
2025-10-29 15:43:51 -04:00
Paul Kompfner
8ac421f8fd Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Remove unused imports.
2025-10-29 15:43:51 -04:00
Paul Kompfner
75b3ea9c96 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Fix tracing.
2025-10-29 15:43:51 -04:00
Paul Kompfner
95be1510ac Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Improve `OpenAIRealtimeLLMAdapter.get_messages_for_logging()`.
2025-10-29 15:43:51 -04:00
Paul Kompfner
df19011080 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Improve warning about transcription frame direction change.
2025-10-29 15:43:51 -04:00
Paul Kompfner
e42cf78e79 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update deprecation versions.
2025-10-29 15:43:51 -04:00
Paul Kompfner
0495de52b6 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Log warning about transcription frame direction change.
2025-10-29 15:43:51 -04:00
Paul Kompfner
9bc02afd0d Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
CHANGELOG tweak.
2025-10-29 15:43:51 -04:00
Paul Kompfner
6140fdb2c9 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
In anticipation of `messages` property being added to `LLMContext` (in another PR), remove warnings about the need to use `get_messages()` instead.
2025-10-29 15:43:51 -04:00
Paul Kompfner
b6a1886dae Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd). 2025-10-29 15:43:51 -04:00
Paul Kompfner
42d0a097c5 Tweaks to 20b example 2025-10-29 15:43:51 -04:00
Paul Kompfner
3761804146 Make OpenAIRealtimeLLMService's websocket send method more resilient. Previously, it was possible for a websocket send attempt to occur during a disconnect. 2025-10-29 15:43:51 -04:00
Paul Kompfner
46e97c57c2 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update 20b example to use new `LLMContext` pattern.
2025-10-29 15:43:51 -04:00
Paul Kompfner
19770b76b4 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add back file that was removed, when it should've just been deprecated.

Also, fix version numbers in deprecation messages to match the next expected release.
2025-10-29 15:43:51 -04:00
Paul Kompfner
b34461bf93 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd). 2025-10-29 15:43:47 -04:00
Paul Kompfner
bab0aaf585 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update `create_context_aggregator()` (which we're keeping around for backward compatibility) to create a `LLMContextAggregatorPair` rather than OpenAI-Realtime-specific aggregators.
2025-10-29 15:36:58 -04:00
Paul Kompfner
61944d22ef Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Implement sending tool call results to the OpenAI server based on reading context updates. This lets us use the normal assistant context aggregator and not a special OpenAI Realtime subclass that pushes up a special frame for function call results.
2025-10-29 15:36:58 -04:00
Paul Kompfner
47756319be Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Receiving a new context (via a context frame) no longer serves as a signal to reset the conversation. That’s because we’re now receiving new contexts from the user aggregator every time new messages are added, and from the assistant aggregator when function call results come in. The code pattern we're heading towards, of “diffing” each new context with the previous on, sets us up for doing more sophisticated things in the future, like sending specific messages to OpenAI to edit its internally-tracked context.

Also, remove code that was directly modifying context.
2025-10-29 15:36:58 -04:00
Paul Kompfner
5fa56df014 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update 19b example with new pattern.
2025-10-29 15:36:58 -04:00
Paul Kompfner
8a151235c3 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Deprecate `send_transcription_frames`—transcription frames are now always sent.
2025-10-29 15:36:57 -04:00
Paul Kompfner
ec42f8c24e Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Push `TranscriptionFrame`s upstream, to be handled by the user context aggregator. This will require at least a couple of other changes:
- Updating examples to put transcript processors upstream from `OpenAIRealtimeLLMService`
- Maybe figuring out a way to preserve backward compatibility with existing pipelines that put transcript processors downstream from `OpenAIRealtimeLLMService`
- Updating `OpenAIRealtimeLLMService` to ignore new received context frames, since the upstream user context aggregator will generate those after each newly-added user message; hopefully nobody was reliant on the old behavior of resetting the conversation upon receiving a new context!
2025-10-29 15:36:57 -04:00
Paul Kompfner
29fd17b9ff Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Avoid pushing `LLMTextFrame` when `OpenAIRealtimeLLMService` is configured to output audio. This avoids duplicate text in assistant messages in context. Conceptually, a speech-to-speech service encapsulates TTS behavior; in a "traditional" pipeline, `LLMTextFrames` are swallowed by the TTS service, so they should similarly not be pushed by a speech-to-speech service. Only. `TTSTextFrame`s should be pushed.
2025-10-29 15:36:57 -04:00
Paul Kompfner
3ea1e357f2 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (initial part of work) 2025-10-29 15:36:57 -04:00
kompfner
351ef617ae Merge pull request #2932 from pipecat-ai/pk/gemini-live-universal-llmcontext
Update `GeminiLLMService` to work with `LLMContext` and `LLMContextAg…
2025-10-29 15:35:13 -04:00
Paul Kompfner
9dafb715c4 Update some deprecation versions 2025-10-29 15:30:43 -04:00
Paul Kompfner
82d494d3d4 Fix a bug in GeminiLiveLLMService related to ending gracefully—i.e. waiting for the bot to stop responding before ending the pipeline—when the service is configured with the TEXT modality 2025-10-29 14:34:02 -04:00
Mark Backman
e893aaa620 Merge pull request #2931 from ivaaan/hume-bugfix
Hume: use Octave v1 if description provided
2025-10-29 13:40:21 -04:00
Paul Kompfner
65c17a698e Whoops - fix a bug in GeminiLiveLLMService where we weren't checking if a tool call result was already handled before reporting it to the LLM 2025-10-29 12:44:00 -04:00
Paul Kompfner
615aae5b95 Fix GeminiLiveLLMService's sending of LLMFullResponseStartFrame and LLMFullResponseEndFrame so that they properly bookend responses.
Properly bookended responses now work with:
- AUDIO modality (validated with 26b example)
- TEXT modality (validated with 26d example)
- AUDIO modality with Vertex AI (validated with 26h example)

It doesn't seem that TEXT modality is supported with Vertex AI, hence the missing "quadrant" of validation.
2025-10-29 12:33:37 -04:00
Ivan A
2f1061f300 Merge branch 'main' into hume-bugfix 2025-10-29 17:06:50 +01:00
ivaaan
9307079af2 upd changelog 2025-10-29 17:05:41 +01:00
Mark Backman
efa64642a4 Merge pull request #2930 from pipecat-ai/mb/simli-constructor-update
Update Simli to align with Pipecat constructor norms
2025-10-29 11:50:11 -04:00
Mark Backman
ede6c32149 Update Simli to align with Pipecat constructor norms 2025-10-29 11:47:23 -04:00
Roshan
b0f5fc02c4 refactor: use Pydantic BaseModel for GenerationConfig and simplify model_dump()
- Change GenerationConfig from dataclass to Pydantic BaseModel for consistency
- Simplify _build_msg() to use model_dump(exclude_none=True) instead of manual field extraction
- Simplify HTTP run_tts() to use model_dump(exclude_none=True) instead of manual field extraction

This addresses feedback from code review and reduces code duplication.
2025-10-28 18:41:58 -07:00
Aleix Conchillo Flaqué
493d6bf91e Merge pull request #2936 from pipecat-ai/aleix/daily-python-0.21.0
pyproject: update daily-python to 0.21.0
2025-10-28 18:25:25 -07:00
Aleix Conchillo Flaqué
aaebcae2e8 pyproject: update daily-python to 0.21.0 2025-10-28 17:23:37 -07:00
Roshan
408264a0fd docs: update CHANGELOG.md for generation_config feature 2025-10-28 15:16:49 -07:00
Roshan
df8aa3e4b0 feat: add generation_config support for Cartesia Sonic-3
Add GenerationConfig dataclass with volume, speed, and emotion parameters
for Cartesia Sonic-3 TTS models. This enables fine-grained control over
speech generation including volume (0.5-2.0), speed (0.6-1.5), and
emotion (60+ options).

Changes:
- Add GenerationConfig dataclass with proper Google-style docstrings
- Update CartesiaTTSService.InputParams to include generation_config
- Update CartesiaHttpTTSService.InputParams to include generation_config
- Modify _build_msg() to include generation_config in WebSocket messages
- Modify run_tts() to include generation_config in HTTP requests
- Maintain backward compatibility with existing speed and emotion parameters

The legacy speed (literal strings) and emotion (list) parameters remain
available for non-Sonic-3 models.
2025-10-28 15:10:46 -07:00
Mark Backman
4d82a1260b Merge pull request #2933 from pipecat-ai/mb/remove-aiohttp-session-sarvam
Remove aiohttp_session arg from SarvamTTSService
2025-10-28 16:54:56 -04:00
Paul Kompfner
f974c66e12 Update GeminiLLMService to work with LLMContext and LLMContextAggregatorPair 2025-10-28 15:46:28 -04:00
Mark Backman
533372ed37 Remove aiohttp_session arg from SarvamTTSService 2025-10-28 15:39:14 -04:00
ivaaan
a9118eb2cd use Octave 1 if description provided 2025-10-28 20:36:34 +01:00
Aleix Conchillo Flaqué
84ed2468e5 Merge pull request #2924 from pipecat-ai/aleix/daily-transport-remove-join-timeout
DailyTransport: don't timeout prematurely on join
2025-10-28 10:43:28 -07:00
Aleix Conchillo Flaqué
d82d855c20 DailyTransport: don't timeout prematurely on leave 2025-10-28 10:41:19 -07:00
Mark Backman
412ff2a4a1 Merge pull request #2929 from pipecat-ai/mb/cartesia-sonic-3
Update Cartesia's default model to sonic-3
2025-10-28 13:07:28 -04:00
Mark Backman
82ccc160fb Merge pull request #2923 from pipecat-ai/mb/runner-no-proxy-required
Remove development runner requirement for proxy
2025-10-28 11:59:38 -04:00
Mark Backman
9ef60bd468 Update Cartesia's default model to sonic-3 2025-10-28 11:49:54 -04:00
Aleix Conchillo Flaqué
f3c4bf08dd DailyTransport: don't timeout prematurely on join 2025-10-27 17:52:19 -07:00
Mark Backman
f2cfbee3c3 Remove development runner requirement for proxy 2025-10-27 16:18:31 -04:00
Vanessa Pyne
8b063116ab Merge pull request #2921 from pipecat-ai/vp-azure-ex-cleanup
cleanup logger message
2025-10-27 12:59:08 -05:00
vipyne
8096e62b34 cleanup logger message 2025-10-27 11:27:30 -05:00
kompfner
20f4b0e8ff Merge pull request #2914 from pipecat-ai/pk/gemini-function-calling-fixes
Gemini function calling fixes
2025-10-27 09:45:29 -04:00
Paul Kompfner
6feaf91789 Fix a bug in GeminiLLMAdapter's handling of Gemini-specific context messages 2025-10-27 09:42:24 -04:00
Mark Backman
91d3ae07b3 Merge pull request #2915 from Rickaym/fix--rounding-the-edges-of-observer-function-method-deprecation
fix: use correct  property names
2025-10-24 19:42:34 -04:00
Pyae Sone Myo
71841f71ef fix: use correct property names 2025-10-25 00:47:46 +06:30
Paul Kompfner
949b807023 Close genai client more gracefully to avoid printed warnings. We're now following the genai library guidance: https://github.com/googleapis/python-genai?tab=readme-ov-file#close-a-client 2025-10-24 11:36:25 -04:00
Paul Kompfner
4ad15f9a01 Update Gemini service to include function name when sending function responses in context 2025-10-24 11:04:52 -04:00
Paul Kompfner
99d94fc625 Update Gemini service to use "user" role for function responses, as shown in the Gemini docs 2025-10-24 10:05:14 -04:00
Mark Backman
a3d630c0d1 Merge pull request #2908 from pipecat-ai/mb/runner-daily-start-route
fix: add support for DAILY_SAMPLE_ROOM_URL when calling /start for Da…
2025-10-23 14:15:42 -04:00
Mark Backman
04b482c445 Merge branch 'main' into mb/runner-daily-start-route 2025-10-23 14:11:38 -04:00
Mark Backman
b2bce4916f Merge pull request #2900 from pipecat-ai/mb/quickstart-pipecat-cli
Quickstart to use Pipecat CLI
2025-10-23 10:55:42 -04:00
Mark Backman
60e9817f16 fix: add support for DAILY_SAMPLE_ROOM_URL when calling /start for DailyTransport 2025-10-22 16:48:30 -04:00
kompfner
c655d0d313 Merge pull request #2907 from pipecat-ai/mb/service-switcher-updates
ServiceSwitcher updates
2025-10-22 11:23:48 -04:00
Paul Kompfner
ea6e146f2d Update TestServiceSwitcher to exercise targeting system frames only to the active service 2025-10-22 11:14:27 -04:00
Mark Backman
ec890a834f Rename to filter_system_frames 2025-10-22 11:01:33 -04:00
Mark Backman
5b921fc054 fix: FunctionFilter adds block_system_frame arg 2025-10-22 10:53:01 -04:00
Mark Backman
f1040100f4 Update ServiceSwitcher and LLMSwitcher docstrings 2025-10-22 10:51:03 -04:00
Mark Backman
54691ee781 Merge pull request #2904 from pipecat-ai/mb/bump-aws-sdk-bedrock-runtime
Upgrade aws_sdk_bedrock_runtime to v0.1.1
2025-10-22 08:58:48 -04:00
Mark Backman
49239a23c6 Upgrade aws_sdk_bedrock_runtime to v0.1.1 2025-10-21 23:27:38 -04:00
Aleix Conchillo Flaqué
e0c43de13f Merge pull request #2903 from pipecat-ai/aleix/pipecat-0.0.91
update CHANGELOG for 0.0.91
2025-10-21 19:09:23 -07:00
Aleix Conchillo Flaqué
cc4c96d099 update CHANGELOG for 0.0.91 2025-10-21 19:00:51 -07:00
Aleix Conchillo Flaqué
788465cb04 Merge pull request #2901 from pipecat-ai/pk/llmcontext-messages
Add `messages` property to `LLMContext` for usage parity with `OpenAI…
2025-10-21 18:00:33 -07:00
Aleix Conchillo Flaqué
db934eade0 Merge pull request #2891 from pipecat-ai/aleix/daily-pipecat-runner-args
runner: allow starting a bot from Daily's /start endpoint
2025-10-21 17:59:13 -07:00
Mark Backman
0b8c966a11 Merge pull request #2892 from pipecat-ai/mb/aws-llm-claude-fix
fix: AWSBedrockLLMService compatibility for newer Claude models
2025-10-21 20:50:20 -04:00
Mark Backman
5849485bc6 fix: AWSBedrockLLMService compatibility for newer Claude models 2025-10-21 19:47:27 -04:00
Aleix Conchillo Flaqué
459af58540 runner: allow starting a bot from Daily's /start endpoint 2025-10-21 16:28:11 -07:00
Aleix Conchillo Flaqué
576bd67e85 runner: add body field to RunnerArguments 2025-10-21 16:27:48 -07:00
Aleix Conchillo Flaqué
1e8629bf96 runner: allow passing an api_key to configure 2025-10-21 16:27:48 -07:00
Paul Kompfner
776a3526f9 Add messages property to LLMContext for usage parity with OpenAILLMContext.
This wasn't really an issue before, when folks were *knowingly* migrating from `OpenAILLMContext` to `LLMContext`. But in the latest AWS Nova Sonic change, we're swapping it out from under folks, so this kind of compatibility is more important.

For context, the reason we *didn't* offer the `messages` property earlier was to aid in the development of `LLMContext`—we wanted to draw attention to all the places where messages were being read from context, so we could find the places where we might need to pass an argument to the read.
2025-10-21 17:38:39 -04:00
kompfner
2ced044418 Merge pull request #2896 from pipecat-ai/pk/add-back-types-that-were-meant-to-be-deprecated-not-removed
Add back types that were removed, when they should only have been dep…
2025-10-21 17:33:17 -04:00
Mark Backman
151f187837 Merge pull request #2895 from pipecat-ai/mb/update-env-example
Organize the env.example file
2025-10-21 17:15:22 -04:00
Mark Backman
67afa718d0 Merge pull request #2898 from pipecat-ai/mb/ellipses-changelog
Changelog entry for PR #2877
2025-10-21 17:02:08 -04:00
Mark Backman
52ab0eccc0 Quickstart to use Pipecat CLI 2025-10-21 15:57:45 -04:00
Vanessa Pyne
d1f1b68b71 Merge pull request #2863 from pipecat-ai/vp-custom-frame-processor-ex
add 08-custom-frame-processor.py to foundational examples
2025-10-21 14:15:38 -05:00
Mark Backman
a479c32665 Merge pull request #2894 from pipecat-ai/mb/cli-readme
Add Pipecat CLI to README's ecosystem section
2025-10-21 13:20:12 -04:00
Mark Backman
9f66b0ba41 Add Pipecat CLI to README's ecosystem section 2025-10-21 13:17:37 -04:00
vipyne
23385ca3d2 replace foundational example 08-bots-arguing.py with 08-custom-frame-processor.py 2025-10-21 11:56:35 -05:00
vipyne
8b24bae9c5 pr notes 2025-10-21 11:42:06 -05:00
Mark Backman
0502ec6c44 Changelog entry for PR #2877 2025-10-21 11:58:27 -04:00
Mark Backman
81645910e0 Merge pull request #2877 from nimobeeren/patch-1
Add ellipsis character to sentence ending punctuation list
2025-10-21 11:53:17 -04:00
Filipi da Silva Fuchter
d6ab4c41b0 Merge pull request #2897 from pipecat-ai/filipi/fix_proxy_route
Fixed an issue in the runner's proxy_request
2025-10-21 12:28:04 -03:00
Filipi Fuchter
2f92cb8781 Fixed an issue in the runner's proxy_request where a session that exists but has empty data was being treated as invalid. 2025-10-21 11:41:52 -03:00
Paul Kompfner
fbf274374c Add back types that were removed, when they should only have been deprecated 2025-10-21 09:56:31 -04:00
Mark Backman
427efecf5b Organize the env.example file 2025-10-21 09:43:46 -04:00
Filipi da Silva Fuchter
b3e54546ac Merge pull request #2888 from pipecat-ai/filipi/rtvi_duplicated_frames
Fixed an issue where the RTVIProcessor was sending duplicate UserStartedSpeakingFrame and UserStoppedSpeakingFrame messages.
2025-10-21 08:57:32 -03:00
Filipi Fuchter
de46631bac Fixed an issue where the RTVIProcessor was sending duplicate UserStartedSpeakingFrame and UserStoppedSpeakingFrame messages. 2025-10-20 18:39:00 -03:00
vipyne
abf0150261 add 47-custom-frame-processor.py to foundational examples 2025-10-20 12:11:40 -05:00
Aleix Conchillo Flaqué
a0c93ab6de update CHANGELOG cosmetics 2025-10-20 09:07:50 -07:00
Aleix Conchillo Flaqué
4bec566bbf Merge pull request #2885 from pipecat-ai/aleix/daily-python-0.20.0
pyproject: update daily-python to 0.20.0
2025-10-20 08:04:52 -07:00
Aleix Conchillo Flaqué
ec3cd24182 pyproject: update daily-python to 0.20.0 2025-10-20 08:04:34 -07:00
kompfner
e36e64c2e8 Merge pull request #2750 from pipecat-ai/pk/aws-nova-sonic-universal-llmcontext-1
Support new `LLMContext` pattern with `AWSNovaSonicLLMService`
2025-10-20 10:12:53 -04:00
Paul Kompfner
02a88022dd Add a bit more detail to CHANGELOG related to AWSNovaSonicLLMService's support for LLMContext 2025-10-20 10:06:09 -04:00
Paul Kompfner
6cae61f2cc Add a bit more detail to CHANGELOG entry about AWSNovaSonicLLMService's support for LLMContext 2025-10-20 09:50:23 -04:00
Paul Kompfner
3b40079120 Add a detailed warning when trying to import things from pipecat.services.aws_nova_sonic.context or pipecat.services.aws.nova_sonic.context 2025-10-20 09:49:05 -04:00
Paul Kompfner
ff0b38859b Remove AWS Nova Sonic's context.py, which was always concerned with types for internal use only. Now those types are either gone or moved elsewhere. 2025-10-20 09:49:05 -04:00
Paul Kompfner
4d499324d1 Re-apply a change to AWSNovaSonicLLMService that was lost in a rebase 2025-10-20 09:49:05 -04:00
Paul Kompfner
f13e006db2 Bump version in deprecation message in docstring 2025-10-20 09:49:05 -04:00
Paul Kompfner
87d9e8c9cd Re-apply a couple of recent changes to AWSNovaSonicLLMService that were lost in a rebase 2025-10-20 09:49:05 -04:00
Paul Kompfner
4820f1c059 Address some AWSNovaSonicLLMService context-recording edge cases 2025-10-20 09:49:05 -04:00
Paul Kompfner
860c39d1b1 Get rid of LLMContext.get_messages_for_persistent_storage().
The reason for its `system_instruction` argument was to support usage with LLMs where you might pass the system instruction as a parameter to the `LLMService` rather than specifying it in the context.

But as I thought about it more I became unconvinced that the `system_instruction` argument was really beneficial:

- If you specified your system instruction in your context in the first place, it'll still be there when you read messages for persistent storage
- If you didn't specify your system instruction in the context and instead passed it in as an `LLMService` parameter, you most likely *don't* want it to be in the context when you read messages for persistent storage
- ...and if you really really do need to inject it at the start of the context, it's quite easy to do anyway

And if we remove the `system_instruction` argument from `get_messages_for_persistent_storage()`, then it's essentially just `get_messages()`.
2025-10-20 09:49:05 -04:00
Paul Kompfner
ae5c5ed7f6 Update AWSNovaSonicLLMService to work with LLMContext and LLMContextAggregatorPair 2025-10-20 09:49:00 -04:00
Aleix Conchillo Flaqué
7aa01c1ca8 Merge pull request #2882 from pipecat-ai/aleix/base-transport-output-cleanup
base output transport cleanup
2025-10-18 07:38:13 -07:00
Mark Backman
4d6356748f Merge pull request #2819 from shreyas-sarvam/sarvam/tts-v3
feat: Add support for bulbul:v3
2025-10-18 09:36:57 -04:00
Mark Backman
5b1a182421 Merge branch 'main' into sarvam/tts-v3 2025-10-18 09:34:10 -04:00
Mark Backman
6ac0c34413 Merge pull request #2879 from sam-s10s/fix/smx-vocab
Fix for SpeechmaticsSTTService AdditionVocabEntry entries
2025-10-18 09:27:23 -04:00
Mark Backman
c115422dbf Merge pull request #2857 from dan-ince-aai/main
feat: add keyterms_prompt to AssemblyAI service
2025-10-18 09:20:43 -04:00
Mark Backman
a2a973be27 Merge pull request #2842 from nbyers-altira/fix-riva-segmented
Fix NVIDIA Riva Segmented STT by adding missing is_final parameter to _handle_transcription
2025-10-18 09:11:11 -04:00
Aleix Conchillo Flaqué
0407744950 BaseOutputTransport: simplify process_frame 2025-10-17 21:55:20 -07:00
Aleix Conchillo Flaqué
7ce370ccc6 BaseOutputTransport: simplify bot speaking logic 2025-10-17 15:13:20 -07:00
nbyers-altira
a4867f61aa be a tad more precise in changelog 2025-10-17 13:51:49 -04:00
nbyers-altira
a67a765783 add changelog, run linter 2025-10-17 13:49:52 -04:00
nbyers-altira
81221668b1 Merge remote-tracking branch 'upstream/main' into fix-riva-segmented 2025-10-17 13:45:59 -04:00
Daniel Ince
cc9c264940 Merge branch 'main' into main 2025-10-17 15:15:36 +01:00
Sam Sykes
f2c61ac9fd Fix for AdditionVocabEntry without sounds_like items. 2025-10-17 14:34:37 +01:00
Filipi da Silva Fuchter
88f8c10f63 Merge pull request #2875 from pipecat-ai/filipi/rtvi_routes
Creating the WebRTC routes that mimic the ones provided by Pipecat Cloud.
2025-10-17 10:13:45 -03:00
Filipi Fuchter
855f4842dd Creating the WebRTC routes that mimic the ones provided by Pipecat Cloud. 2025-10-17 10:10:19 -03:00
Filipi da Silva Fuchter
2bf44fe2af Merge pull request #2853 from pipecat-ai/filipi/trickle_ice
Adding support for trickle ice.
2025-10-17 09:00:32 -03:00
Filipi Fuchter
3e8a7cc254 Adding support for trickle ICE to the SmallWebRTCTransport. 2025-10-17 08:57:45 -03:00
Daniel Ince
a600c05570 Merge branch 'main' into main 2025-10-17 11:43:38 +01:00
dan-ince-aai
3ba6b55659 feat: multilingual + changelog updates 2025-10-17 11:38:03 +01:00
dan-ince-aai
d5f2dcfac0 lint 2025-10-17 11:32:06 +01:00
Nimo Beeren
d1d74c571c add ellipsis character to sentence ending punctuation list 2025-10-17 10:38:06 +02:00
shreyas-sarvam
d12134038b chore: Update CHANGELOG 2025-10-17 10:07:58 +05:30
shreyas-sarvam
a22af3a7e0 Merge branch 'main' into sarvam/stt 2025-10-17 10:00:49 +05:30
Aleix Conchillo Flaqué
76e07c6c48 Merge pull request #2870 from pipecat-ai/aleix/openaitts-update-settings
OpenAITTSService: allow updating instructions and speed
2025-10-16 13:21:12 -07:00
Aleix Conchillo Flaqué
8d8503bca7 OpenAITTSService: allow updating instructions and speed 2025-10-16 13:20:49 -07:00
Aleix Conchillo Flaqué
a444097060 Merge pull request #2872 from pipecat-ai/aleix/pipeline-task-cancellation-fixes
PipelineTask: fix task cancellation issues
2025-10-16 13:18:13 -07:00
Aleix Conchillo Flaqué
1b9e96c016 PipelineTask: fix task cancellation issues 2025-10-16 13:16:19 -07:00
Vanessa Pyne
7967bc53c3 Merge pull request #2868 from pipecat-ai/vp-whatsapp-dep-mv
only import whatsapp deps if using whatsapp runner
2025-10-16 14:16:28 -05:00
vipyne
6381335346 Add --whatsapp flag to runner 2025-10-16 14:15:26 -05:00
vipyne
0fd5d26104 add WHATSAPP_APP_SECRET to required whatsapp env vars 2025-10-16 10:37:56 -05:00
vipyne
41f817bf04 only import whatsapp deps if using whatsapp runner 2025-10-16 10:37:56 -05:00
shreyas-sarvam
27115e6565 Merge branch 'main' into sarvam/tts-v3 2025-10-16 12:09:50 +05:30
Mark Backman
3c4807d7d4 Merge pull request #2859 from pipecat-ai/mb/openai-package-upgrade
Bump openai, openpipe versions, add 14x foundational example
2025-10-15 15:41:32 -04:00
Mark Backman
8902f1dc94 Bump openai, openpipe versions, add 14x foundational example 2025-10-15 15:17:22 -04:00
Mark Backman
a25333ee51 Merge pull request #2856 from pipecat-ai/mb/pr-2840-cleanup
Fix an issue in ElevenLabsHttpTTSService where the last word is not e…
2025-10-15 15:16:43 -04:00
Mark Backman
82c7d7ad83 Merge pull request #2867 from pipecat-ai/mb/update-moondream-readme
Update moondream chatbot README link
2025-10-15 15:16:19 -04:00
Mark Backman
ba2ab51ef7 Merge pull request #2866 from pipecat-ai/mb/add-sentry-foundational
Add foundation 47-sentry-metrics.py
2025-10-15 15:15:52 -04:00
Mark Backman
22557fa668 Fix an issue in ElevenLabsHttpTTSService where the last word is not emitted 2025-10-15 15:13:54 -04:00
Vanessa Pyne
3fbf59e7c6 Merge pull request #2864 from pipecat-ai/vp-trace-log
WhatsApp transport debug log -> trace log
2025-10-15 13:03:58 -05:00
vipyne
129ab5ea0e WhatsApp transport debug log -> trace log 2025-10-15 13:02:57 -05:00
Aleix Conchillo Flaqué
dc917523d0 Merge pull request #2855 from pipecat-ai/aleix/stt-tts-connected-disconnected-events
services: added on_connected/on_disconnected events
2025-10-15 10:41:38 -07:00
Aleix Conchillo Flaqué
5ea7cc9d32 services: added on_connected/on_disconnected events 2025-10-15 10:39:31 -07:00
Mark Backman
e11ede475b Update moondream chatbot README link 2025-10-15 13:22:56 -04:00
Mark Backman
90d29e04af Merge pull request #2861 from pipecat-ai/mb/11labs-http-apply-text-normalization-fix
fix: set apply_text_normalization as request parameter in ElevenLabsH…
2025-10-15 12:59:36 -04:00
Mark Backman
4c67136a8d Merge pull request #2858 from pipecat-ai/mb/daily-runner-room-properties
Add room_properties to the Daily runner configure() method
2025-10-15 12:58:18 -04:00
Mark Backman
9d78402a33 fix: set apply_text_normalization as request parameter in ElevenLabsHttpTTSService 2025-10-15 12:56:42 -04:00
Mark Backman
73877218e9 Add room_properties to the Daily runner configure() method 2025-10-15 12:55:19 -04:00
Mark Backman
6a1be90cbb Merge pull request #2862 from pipecat-ai/mb/11labs-http-aggregate-sentences
Add aggregate_sentences arg to ElevenLabsHttpTTSService
2025-10-15 12:54:23 -04:00
Aleix Conchillo Flaqué
fbac959ecb Merge pull request #2865 from pipecat-ai/aleix/stop-audio-filter-also-on-cancel
BaseInputTransport: stop audio filter on cancel
2025-10-15 09:53:24 -07:00
Aleix Conchillo Flaqué
18dd85431c Merge pull request #2854 from pipecat-ai/aleix/cartesia-stt-service-websocket
CartesiaSTTService to inherit from WebsocketSTTService
2025-10-15 09:51:42 -07:00
Aleix Conchillo Flaqué
abc569b3d2 examples(foundational/07): use CartesiaSTTService 2025-10-15 09:46:57 -07:00
Mark Backman
fa5d4ecf86 Add foundation 47-sentry-metrics.py 2025-10-15 12:45:07 -04:00
Aleix Conchillo Flaqué
83b0dc39f7 BaseInputTransport: stop audio filter on cancel 2025-10-15 09:22:48 -07:00
Mark Backman
0c31b5ef19 Add aggregate_sentences arg to ElevenLabsHttpTTSService 2025-10-15 11:49:26 -04:00
dan-ince-aai
d16c36c56d feat: add keyterms_prompt to AssemblyAI service 2025-10-15 14:27:52 +01:00
Mark Backman
8fe3bcd484 Merge pull request #2840 from Rickaym/fix--excess-space-in-elevelabs-word-timestamp-joins
fix: handle ElevenLabs partial word concatenation across alignment chunks gracefully
2025-10-15 09:01:05 -04:00
Aleix Conchillo Flaqué
be2858bfbb CartesiaSTTService: inherit from WebsocketSTTService 2025-10-14 14:14:57 -07:00
Pyae Sone Myo
b6b0997553 fix: add support for partial words 2025-10-14 23:06:13 +06:30
Pyae Sone Myo
3b751322d3 fix: add interruption reset for partial word states 2025-10-14 23:04:09 +06:30
nbyers-altira
cc66ac14f1 add is_final to segmented func. sig. instead so tracing is consistent 2025-10-13 10:48:41 -04:00
nbyers-altira
9ddec0f8b4 is_final is not part of the segmented _handle_transcription function signature 2025-10-13 10:44:25 -04:00
shreyas-sarvam
9babfe9fd9 refactor: Improve code reability and replace deprecated interruption frames 2025-10-13 08:54:29 +05:30
Pyae Sone Myo
21d8d148b8 fix: handle partial words across alignment chunks gracefully 2025-10-12 22:10:11 +06:30
shreyas-sarvam
7c1e2793c5 feat: Add support for bulbul:v3 and bulbul:v3-beta 2025-10-09 18:26:22 +05:30
100 changed files with 4156 additions and 1612 deletions

View File

@@ -9,15 +9,350 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added
- Added `generation_config` parameter support to `CartesiaTTSService` and
`CartesiaHttpTTSService` for Cartesia Sonic-3 models. Includes a new
`GenerationConfig` class with `volume` (0.5-2.0), `speed` (0.6-1.5),
and `emotion` (60+ options) parameters for fine-grained speech generation
control.
- Expanded support for univeral `LLMContext` to `OpenAIRealtimeLLMService`.
As a reminder, the context-setup pattern when using `LLMContext` is:
```python
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(
context,
# This part is `OpenAIRealtimeLLMService`-specific.
# `expect_stripped_words=False` needed when OpenAI Realtime used with
# "audio" modality (the default).
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
```
(Note that even though `OpenAIRealtimeLLMService` now supports the universal
`LLMContext`, it is not meant to be swapped out for another LLM service at
runtime with `LLMSwitcher`.)
Note: `TranscriptionFrame`s and `InterimTranscriptionFrame`s now go upstream
from `OpenAIRealtimeLLMService`, so if you're using `TranscriptProcessor`,
say, you'll want to adjust accordingly:
```python
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
# BEFORE
llm,
transcript.user(),
# AFTER
transcript.user(),
llm,
transport.output(),
transcript.assistant(),
context_aggregator.assistant(),
]
)
```
Also worth noting: whether or not you use the new context-setup pattern with
`OpenAIRealtimeLLMService`, some types have changed under the hood:
```python
## BEFORE:
# Context aggregator type
context_aggregator: OpenAIContextAggregatorPair
# Context frame type
frame: OpenAILLMContextFrame
# Context type
context: OpenAIRealtimeLLMContext
# or
context: OpenAILLMContext
## AFTER:
# Context aggregator type
context_aggregator: LLMContextAggregatorPair
# Context frame type
frame: LLMContextFrame
# Context type
context: LLMContext
```
Also note that `RealtimeMessagesUpdateFrame` and
`RealtimeFunctionCallResultFrame` have been deprecated, since they're no
longer used by `OpenAIRealtimeLLMService`. OpenAI Realtime now works more
like other LLM services in Pipecat, relying on updates to its context, pushed
by context aggregators, to update its internal state. Listen for
`LLMContextFrame`s for context updates.
Finally, `LLMTextFrame`s are no longer pushed from `OpenAIRealtimeLLMService`
when it's configured with `output_modalities=['audio']`. If you need
to process its output, listen for `TTSTextFrame`s instead.
- Expanded support for universal `LLMContext` to `GeminiLiveLLMService`.
As a reminder, the context-setup pattern when using `LLMContext` is:
```python
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(
context,
# This part is `GeminiLiveLLMService`-specific.
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default).
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
```
(Note that even though `GeminiLiveLLMService` now supports the universal
`LLMContext`, it is not meant to be swapped out for another LLM service at
runtime with `LLMSwitcher`.)
Worth noting: whether or not you use the new context-setup pattern with
`GeminiLiveLLMService`, some types have changed under the hood:
```python
## BEFORE:
# Context aggregator type
context_aggregator: GeminiLiveContextAggregatorPair
# Context frame type
frame: OpenAILLMContextFrame
# Context type
context: GeminiLiveLLMContext
# or
context: OpenAILLMContext
## AFTER:
# Context aggregator type
context_aggregator: LLMContextAggregatorPair
# Context frame type
frame: LLMContextFrame
# Context type
context: LLMContext
```
Also note that `LLMTextFrame`s are no longer pushed from `GeminiLiveLLMService`
when it's configured with `modalities=GeminiModalities.AUDIO`. If you need
to process its output, listen for `TTSTextFrame`s instead.
### Changed
- `DailyTransport` triggers `on_error` event if transcription can't be started
or stopped.
- `DailyTransport` updates: `start_dialout()` now returns two values:
`session_id` and `error`. `start_recording()` now returns two values:
`stream_id` and `error`.
- Updated `daily-python` to 0.21.0.
- `SimliVideoService` now accepts `api_key` and `face_id` parameters directly,
with optional `params` for `max_session_length` and `max_idle_time`
configuration, aligning with other Pipecat service patterns.
- Updated the default model to `sonic-3` for `CartesiaTTSService` and
`CartesiaHttpTTSService`.
- `FunctionFilter` now has a `filter_system_frames` arg, which controls whether
or not SystemFrames are filtered.
- Upgraded `aws_sdk_bedrock_runtime` to v0.1.1 to resolve potential CPU issues
when running `AWSNovaSonicLLMService`.
### Deprecated
- The `send_transcription_frames` argument to `OpenAIRealtimeLLMService` is
deprecated. Transcription frames are now always sent. They go upstream, to be
handled by the user context aggregator. See "Added" section for details.
- Types in `pipecat.services.openai.realtime.context` and
`pipecat.services.openai.realtime.frames` are deprecated, as they're no
longer used by `OpenAIRealtimeLLMService`. See "Added" section for details.
- `SimliVideoService` `simli_config` parameter is deprecated. Use `api_key` and
`face_id` parameters instead.
### Removed
- Removed the `aiohttp_session` arg from `SarvamTTSService` as it's no longer
used.
### Fixed
- Fixed an issue where `DailyTransport` would timeout prematurely on join and on
leave.
- Fixed an issue in the runner where starting a DailyTransport room via
`/start` didn't support using the `DAILY_SAMPLE_ROOM_URL` env var.
- Fixed an issue in `ServiceSwitcher` where the `STTService`s would result in
all STT services producing `TranscriptionFrame`s.
- Fixed an issue in `HumeTTSService` that was only using Octave 2, which does not support the `description` field. Now, if a description is provided, it switches to Octave 1.
## [0.0.91] - 2025-10-21
### Added
- It is now possible to start a bot from the `/start` endpoint when using the
runner Daily's transport. This follows the Pipecat Cloud format with
`createDailyRoom` and `body` fields in the POST request body.
- Added an ellipsis character (``) to the end of sentence detection in the
string utils.
- Expanded support for universal `LLMContext` to `AWSNovaSonicLLMService`.
As a reminder, the context-setup pattern when using `LLMContext` is:
```python
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
```
(Note that even though `AWSNovaSonicLLMService` now supports the universal
`LLMContext`, it is not meant to be swapped out for another LLM service at
runtime with `LLMSwitcher`.)
Worth noting: whether or not you use the new context-setup pattern with
`AWSNovaSonicLLMService`, some types have changed under the hood:
```python
## BEFORE:
# Context aggregator type
context_aggregator: AWSNovaSonicContextAggregatorPair
# Context frame type
frame: OpenAILLMContextFrame
# Context type
context: AWSNovaSonicLLMContext
# or
context: OpenAILLMContext
## AFTER:
# Context aggregator type
context_aggregator: LLMContextAggregatorPair
# Context frame type
frame: LLMContextFrame
# Context type
context: LLMContext
```
- Added support for `bulbul:v3` model in `SarvamTTSService` and
`SarvamHttpTTSService`.
- Added `keyterms_prompt` parameter to `AssemblyAIConnectionParams`.
- Added `speech_model` parameter to `AssemblyAIConnectionParams` to access the
multilingual model.
- Added support for trickle ICE to the `SmallWebRTCTransport`.
- Added support for updating `OpenAITTSService` settings (`instructions` and
`speed`) at runtime via `TTSUpdateSettingsFrame`.
- Added `--whatsapp` flag to runner to better surface WhatsApp transport logs.
- Added `on_connected` and `on_disconnected` events to TTS and STT
websocket-based services.
- Added an `aggregate_sentences` arg in `ElevenLabsHttpTTSService`, where the
default value is True.
- Added a `room_properties` arg to the Daily runner's `configure()` method,
allowing `DailyRoomProperties` to be provided.
- The runner `--folder` argument now supports downloading files from
subdirectories.
### Changed
- `RunnerArguments` now include the `body` field, so there's no need to add it
to subclasses. Also, all `RunnerArguments` fields are now keyword-only.
- `CartesiaSTTService` now inherits from `WebsocketSTTService`.
- Package upgrades:
- `daily-python` upgraded to 0.20.0.
- `openai` upgraded to support up to 2.x.x.
- `openpipe` upgraded to support up to 5.x.x.
- `SpeechmaticsSTTService` updated dependencies for `speechmatics-rt>=0.5.0`.
### Deprecated
- The `send_transcription_frames` argument to `AWSNovaSonicLLMService` is
deprecated. Transcription frames are now always sent. They go upstream, to be
handled by the user context aggregator. See "Added" section for details.
- Types in `pipecat.services.aws.nova_sonic.context` are deprecated, as they're
no longer used by `AWSNovaSonicLLMService`. See "Added" section for
details.
### Fixed
- Fixed an issue where the `RTVIProcessor` was sending duplicate
`UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame` messages.
- Fixed an issue in `AWSBedrockLLMService` where both `temperature` and `top_p`
were always sent together, causing conflicts with models like Claude Sonnet 4.5
that don't allow both parameters simultaneously. The service now only includes
inference parameters that are explicitly set, and `InputParams` defaults have
been changed to `None` to rely on AWS Bedrock's built-in model defaults.
- Fixed an issue in `RivaSegmentedSTTService` where a runtime error occurred due
to a mismatch in the `_handle_transcription` method's signature.
- Fixed multiple pipeline task cancellation issues. `asyncio.CancelledError` is
now handled properly in `PipelineTask` making it possible to cancel an asyncio
task that it's executing a `PipelineRunner` cleanly. Also,
`PipelineTask.cancel()` does not block anymore waiting for the `CancelFrame`
to reach the end of the pipeline (going back to the behavior in < 0.0.83).
- Fixed an issue in `ElevenLabsTTSService` and `ElevenLabsHttpTTSService` where
the Flash models would split words, resulting in a space being inserted
between words.
- Fixed an issue where audio filters' `stop()` would not be called when using
`CancelFrame`.
- Fixed an issue in `ElevenLabsHttpTTSService`, where
`apply_text_normalization` was incorrectly set as a query parameter. It's now
being added as a request parameter.
- Fixed an issue where `RimeHttpTTSService` and `PiperTTSService` could generate
incorrectly 16-bit aligned audio frames, potentially leading to internal
errors or static audio.
- Fixed an issue in `SpeechmaticsSTTService` where `AdditionalVocabEntry` items
needed to have `sounds_like` for the session to start.
### Other
- Added foundational example `47-sentry-metrics.py`, demonstrating how to use the
`SentryMetrics` processor.
- Added foundational example `14x-function-calling-openpipe.py`.
## [0.0.90] - 2025-10-10
### Added

View File

@@ -44,6 +44,10 @@ Looking to build structured conversations? Check out [Pipecat Flows](https://git
Want to build beautiful and engaging experiences? Checkout the [Voice UI Kit](https://github.com/pipecat-ai/voice-ui-kit), a collection of components, hooks and templates for building voice AI applications quickly.
### 🛠️ Create and deploy projects
Create a new project in under a minute with the [Pipecat CLI](https://github.com/pipecat-ai/pipecat-cli). Then use the CLI to monitor and deploy your agent to production.
### 🔍 Debugging
Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
@@ -63,24 +67,24 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/moondream-chatbot/image.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12-describe-video.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/assets/moondream.png" width="400" /></a>
</p>
## 🧩 Available services
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Category | Services |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova) [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)

View File

@@ -4,6 +4,9 @@ AICOUSTICS_LICENSE_KEY=...
# Anthropic
ANTHROPIC_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# Async
ASYNCAI_API_KEY=...
ASYNCAI_VOICE_ID=...
@@ -21,12 +24,19 @@ AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_REALTIME_API_KEY=...
AZURE_REALTIME_BASE_URL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Cartesia
CARTESIA_API_KEY=...
CARTESIA_VOICE_ID=...
# Cerebras
CEREBRAS_API_KEY=...
# Daily
DAILY_API_KEY=...
@@ -35,57 +45,48 @@ DAILY_SAMPLE_ROOM_URL=https://...
# Deepgram
DEEPGRAM_API_KEY=...
# DeepSeek
DEEPSEEK_API_KEY=...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Neuphonic
NEUPHONIC_API_KEY=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Gladia
GLADIA_API_KEY=...
GLADIA_REGION=...
# Google
GOOGLE_API_KEY=...
GOOGLE_CLOUD_PROJECT_ID=...
GOOGLE_TEST_CREDENTIALS=...
GOOGLE_VERTEX_TEST_CREDENTIALS=...
GOOGLE_CLOUD_PROJECT_ID=...
GOOGLE_CLOUD_LOCATION=...
GOOGLE_TEST_CREDENTIALS=...
# Grok
GROK_API_KEY=...
# Groq
GROQ_API_KEY=...
# Heygen
HEYGEN_API_KEY=...
# Hume
HUME_API_KEY=...
HUME_VOICE_ID=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# Perplexity
PERPLEXITY_API_KEY=...
# PlayHT
PLAYHT_USER_ID=...
PLAYHT_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
TAVUS_PERSONA_ID=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Inworld
INWORLD_API_KEY=...
# Krisp
KRISP_MODEL_PATH=...
@@ -93,77 +94,100 @@ KRISP_MODEL_PATH=...
# Krisp Viva
KRISP_VIVA_MODEL_PATH=...
# DeepSeek
DEEPSEEK_API_KEY=...
# LiveKit
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
# Groq
GROQ_API_KEY=...
# Grok
GROK_API_KEY=...
# Inworld
INWORLD_API_KEY=...
# Together.ai
TOGETHER_API_KEY=...
# Cerebras
CEREBRAS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Piper
PIPER_BASE_URL=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=...
FAL_SMART_TURN_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# MiniMax
MINIMAX_API_KEY=...
MINIMAX_GROUP_ID=...
# Sarvam AI
SARVAM_API_KEY=...
# Soniox
SONIOX_API_KEY=
# Speechmatics
SPEECHMATICS_API_KEY=...
# SambaNova
SAMBANOVA_API_KEY=...
# Sentry
SENTRY_DSN=...
# Heygen
HEYGEN_API_KEY=...
# Mistral
MISTRAL_API_KEY=...
# Neuphonic
NEUPHONIC_API_KEY=...
# NVIDIA
NVIDIA_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Perplexity
PERPLEXITY_API_KEY=...
# Picovoice Koala
KOALA_ACCESS_KEY=...
# Piper
PIPER_BASE_URL=...
# PlayHT
PLAYHT_USER_ID=...
PLAYHT_API_KEY=...
# Plivo
PLIVO_AUTH_ID=...
PLIVO_AUTH_TOKEN=...
# Qwen
QWEN_API_KEY=...
# Rime
RIME_API_KEY=...
RIME_VOICE_ID=...
# SambaNova
SAMBANOVA_API_KEY=...
# Sarvam AI
SARVAM_API_KEY=...
# Sentry
SENTRY_DSN=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=...
FAL_SMART_TURN_API_KEY=...
# Soniox
SONIOX_API_KEY=...
# Speechmatics
SPEECHMATICS_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
# Telnyx
TELNYX_API_KEY=...
TELNYX_ACCOUNT_SID=...
# Together.ai
TOGETHER_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
# WhatsApp
WHATSAPP_TOKEN=
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=
WHATSAPP_PHONE_NUMBER_ID=
WHATSAPP_APP_SECRET=
WHATSAPP_TOKEN=...
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=...
WHATSAPP_PHONE_NUMBER_ID=...
WHATSAPP_APP_SECRET=...

View File

@@ -21,8 +21,8 @@ from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.stt import CartesiaSTTService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
@@ -58,7 +58,7 @@ transport_params = {
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),

View File

@@ -67,8 +67,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm = AWSBedrockLLMService(
aws_region="us-west-2",
model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
params=AWSBedrockLLMService.InputParams(temperature=0.8),
)
messages = [

View File

@@ -1,147 +0,0 @@
import asyncio
import logging
import os
from typing import Tuple
import aiohttp
from dotenv import load_dotenv
from pipecat.frames.frames import AudioFrame, EndFrame, ImageFrame, LLMContextFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.processors.aggregators import SentenceAggregator
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.daily import configure
from pipecat.services.azure import AzureLLMService, AzureTTSService
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.fal import FalImageGenService
from pipecat.transports.daily.transport import DailyTransport
load_dotenv(override=True)
logging.basicConfig(format=f"%(levelno)s %(asctime)s %(message)s")
logger = logging.getLogger("pipecat")
logger.setLevel(logging.DEBUG)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url,
None,
"Respond bot",
duration_minutes=10,
mic_enabled=True,
mic_sample_rate=16000,
camera_enabled=True,
camera_width=1024,
camera_height=1024,
)
llm = AzureLLMService(
api_key=os.getenv("AZURE_CHATGPT_API_KEY"),
endpoint=os.getenv("AZURE_CHATGPT_ENDPOINT"),
model=os.getenv("AZURE_CHATGPT_MODEL"),
)
tts1 = AzureTTSService(
api_key=os.getenv("AZURE_SPEECH_API_KEY"),
region=os.getenv("AZURE_SPEECH_REGION"),
)
tts2 = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id="jBpfuIE2acCO8z3wKNLl",
)
dalle = FalImageGenService(
params=FalImageGenService.InputParams(image_size="1024x1024"),
aiohttp_session=session,
key=os.getenv("FAL_KEY"),
)
bot1_messages = [
{
"role": "system",
"content": "You are a stern librarian. You strongly believe that a hot dog is a sandwich. Start by stating this fact in a few sentences, then be prepared to debate this with the user. You shouldn't ever compromise on the fundamental truth that a hot dog is a sandwich. Your responses should only be a few sentences long.",
},
]
bot2_messages = [
{
"role": "system",
"content": "You are a silly cat, and you strongly believe that a hot dog is not a sandwich. Debate this with the user, only responding with a few sentences. Don't ever accept that a hot dog is a sandwich.",
},
]
async def get_text_and_audio(messages) -> Tuple[str, bytearray]:
"""This function streams text from the LLM and uses the TTS service to convert
that text to speech as it's received.
"""
source_queue = asyncio.Queue()
sink_queue = asyncio.Queue()
sentence_aggregator = SentenceAggregator()
pipeline = Pipeline([llm, sentence_aggregator, tts1], source_queue, sink_queue)
await source_queue.put(LLMContextFrame(LLMContext(messages)))
await source_queue.put(EndFrame())
await pipeline.run_pipeline()
message = ""
all_audio = bytearray()
while sink_queue.qsize():
frame = sink_queue.get_nowait()
if isinstance(frame, TextFrame):
message += frame.text
elif isinstance(frame, AudioFrame):
all_audio.extend(frame.audio)
return (message, all_audio)
async def get_bot1_statement():
message, audio = await get_text_and_audio(bot1_messages)
bot1_messages.append({"role": "assistant", "content": message})
bot2_messages.append({"role": "user", "content": message})
return audio
async def get_bot2_statement():
message, audio = await get_text_and_audio(bot2_messages)
bot2_messages.append({"role": "assistant", "content": message})
bot1_messages.append({"role": "user", "content": message})
return audio
async def argue():
for i in range(100):
print(f"In iteration {i}")
bot1_description = "A woman conservatively dressed as a librarian in a library surrounded by books, cartoon, serious, highly detailed"
(audio1, image_data1) = await asyncio.gather(
get_bot1_statement(), dalle.run_image_gen(bot1_description)
)
await transport.send_queue.put(
[
ImageFrame(image_data1[1], image_data1[2]),
AudioFrame(audio1),
]
)
bot2_description = "A cat dressed in a hot dog costume, cartoon, bright colors, funny, highly detailed"
(audio2, image_data2) = await asyncio.gather(
get_bot2_statement(), dalle.run_image_gen(bot2_description)
)
await transport.send_queue.put(
[
ImageFrame(image_data2[1], image_data2[2]),
AudioFrame(audio2),
]
)
await asyncio.gather(transport.run(), argue())
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -0,0 +1,170 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import io
import os
import re
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import (
Frame,
LLMRunFrame,
MetricsFrame,
)
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
load_dotenv(override=True)
def format_metrics(metrics, indent=0):
lines = []
tab = "\t" * indent
for metric in metrics:
lines.append(tab + type(metric).__name__)
for field, value in vars(metric).items():
if hasattr(value, "__dict__") and not isinstance(
value, (str, int, float, bool, type(None))
):
lines.append(f"{tab}\t{field}={type(value).__name__}")
for k, v in vars(value).items():
lines.append(f"{tab}\t\t{k}={repr(v)}")
else:
lines.append(f"{tab}\t{field}={repr(value)}")
return "\n".join(lines)
class MetricsFrameLogger(FrameProcessor):
"""MetricsFrameLogger formats and logs all MetericsFrames"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
async def process_frame(self, frame: Frame, direction: FrameDirection):
await super().process_frame(frame, direction)
if isinstance(frame, MetricsFrame):
logger.info(f"{frame.name}\n {format_metrics(frame.data)}")
await self.push_frame(frame, direction)
# ALWAYS push all frames
else:
# SUPER IMPORTANT: always push every frame!
await self.push_frame(frame, direction)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
video_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
metrics_frame_processor = MetricsFrameLogger()
pipeline = Pipeline(
[
transport.input(),
stt,
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
metrics_frame_processor, # pretty print metrics frames
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected: {client}")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -48,10 +48,7 @@ transport_params = {
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = CartesiaSTTService(
api_key=os.getenv("CARTESIA_API_KEY"),
base_url=os.getenv("CARTESIA_BASE_URL"),
)
stt = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
tl = TranscriptionLogger()

View File

@@ -79,8 +79,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm = AWSBedrockLLMService(
aws_region="us-west-2",
model="us.anthropic.claude-3-5-haiku-20241022-v1:0",
params=AWSBedrockLLMService.InputParams(temperature=0.8, latency="optimized"),
model="us.anthropic.claude-haiku-4-5-20251001-v1:0",
params=AWSBedrockLLMService.InputParams(temperature=0.8),
)
# You can also register a function_name of None to get all functions

View File

@@ -0,0 +1,182 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import time
from dotenv import load_dotenv
from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.services.openpipe.llm import OpenPipeLLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
async def fetch_weather_from_api(params: FunctionCallParams):
await params.result_callback({"conditions": "nice", "temperature": "75"})
async def fetch_restaurant_recommendation(params: FunctionCallParams):
await params.result_callback({"name": "The Golden Dragon"})
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
timestamp = int(time.time())
llm = OpenPipeLLMService(
api_key=os.getenv("OPENAI_API_KEY"),
openpipe_api_key=os.getenv("OPENPIPE_API_KEY"),
tags={"conversation_id": f"pipecat-{timestamp}"},
)
# You can also register a function_name of None to get all functions
# sent to the same callback with an additional function_name parameter.
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
@llm.event_handler("on_function_calls_started")
async def on_function_calls_started(service, function_calls):
await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
weather_function = FunctionSchema(
name="get_current_weather",
description="Get the current weather",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"format": {
"type": "string",
"enum": ["celsius", "fahrenheit"],
"description": "The temperature unit to use. Infer this from the user's location.",
},
},
required=["location", "format"],
)
restaurant_function = FunctionSchema(
name="get_restaurant_recommendation",
description="Get a restaurant recommendation",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
},
required=["location"],
)
tools = ToolsSchema(standard_tools=[weather_function, restaurant_function])
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(),
stt,
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -82,25 +82,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
async def handle_user_idle(user_idle: UserIdleProcessor, retry_count: int) -> bool:
if retry_count == 1:
# First attempt: Add a gentle prompt to the conversation
message = {
"role": "system",
"content": "The user has been quiet. Politely and briefly ask if they're still there.",
}
await user_idle.push_frame(LLMMessagesAppendFrame([message], run_llm=True))
await user_idle.push_frame(TTSSpeakFrame("First check."))
return True
elif retry_count == 2:
# Second attempt: More direct prompt
message = {
"role": "system",
"content": "The user is still inactive. Ask if they'd like to continue our conversation.",
}
await user_idle.push_frame(LLMMessagesAppendFrame([message], run_llm=True))
await user_idle.push_frame(TTSSpeakFrame("Second check."))
return True
else:
# Third attempt: End the conversation
await user_idle.push_frame(
TTSSpeakFrame("It seems like you're busy right now. Have a nice day!")
)
await user_idle.push_frame(TTSSpeakFrame("Third check."))
await task.queue_frame(EndFrame())
return False
@@ -128,6 +118,14 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
turn_observer = task.turn_tracking_observer
if turn_observer:
@turn_observer.event_handler("on_turn_ended")
async def on_turn_ended(observer, turn_number, duration, was_interrupted):
# Log the context after every turn
logger.info(f"📝 Context after turn {turn_number}: {context.get_messages()}")
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")

View File

@@ -5,6 +5,7 @@
#
import asyncio
import os
from datetime import datetime
@@ -14,12 +15,14 @@ from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
from pipecat.frames.frames import LLMRunFrame, LLMSetToolsFrame, TranscriptionMessage
from pipecat.observers.loggers.transcription_log_observer import TranscriptionLogObserver
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.transcript_processor import TranscriptProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -52,6 +55,18 @@ async def fetch_weather_from_api(params: FunctionCallParams):
)
async def get_news(params: FunctionCallParams):
await params.result_callback(
{
"news": [
"Massive UFO currently hovering above New York City",
"Stock markets reach all-time highs",
"Living dinosaur species discovered in the Amazon rainforest",
],
}
)
async def fetch_restaurant_recommendation(params: FunctionCallParams):
await params.result_callback({"name": "The Golden Dragon"})
@@ -73,6 +88,13 @@ weather_function = FunctionSchema(
required=["location", "format"],
)
get_news_function = FunctionSchema(
name="get_news",
description="Get the current news.",
properties={},
required=[],
)
restaurant_function = FunctionSchema(
name="get_restaurant_recommendation",
description="Get a restaurant recommendation",
@@ -140,10 +162,6 @@ even if you're asked about them.
You are participating in a voice conversation. Keep your responses concise, short, and to the point
unless specifically asked to elaborate on a topic.
You have access to the following tools:
- get_current_weather: Get the current weather for a given location.
- get_restaurant_recommendation: Get a restaurant recommendation for a given location.
Remember, your responses should be short. Just one or two sentences, usually. Respond in English.""",
)
@@ -157,25 +175,31 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
# llm.register_function(None, fetch_weather_from_api)
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
llm.register_function("get_news", get_news)
transcript = TranscriptProcessor()
# Create a standard OpenAI LLM context object using the normal messages format. The
# OpenAIRealtimeLLMService will convert this internally to messages that the
# openai WebSocket API can understand.
context = OpenAILLMContext(
context = LLMContext(
[{"role": "user", "content": "Say hello!"}],
tools,
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when OpenAI Realtime used with
# "audio" modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[
transport.input(), # Transport user input
context_aggregator.user(),
transcript.user(), # LLM pushes TranscriptionFrames upstream
llm, # LLM
transcript.user(), # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
transport.output(), # Transport bot output
transcript.assistant(), # After the transcript output, to time with the audio output
context_aggregator.assistant(),
@@ -198,6 +222,13 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
# Kick off the conversation.
await task.queue_frames([LLMRunFrame()])
# Add a new tool at runtime after a delay.
await asyncio.sleep(15)
new_tools = ToolsSchema(
standard_tools=[weather_function, restaurant_function, get_news_function]
)
await task.queue_frames([LLMSetToolsFrame(tools=new_tools)])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")

View File

@@ -18,7 +18,9 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.azure.realtime.llm import AzureRealtimeLLMService
@@ -155,10 +157,10 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
# Create a standard OpenAI LLM context object using the normal messages format. The
# Create a standard LLM context object using the normal messages format. The
# OpenAIRealtimeBetaLLMService will convert this internally to messages that the
# openai WebSocket API can understand.
context = OpenAILLMContext(
context = LLMContext(
[{"role": "user", "content": "Say hello!"}],
# [{"role": "user", "content": [{"type": "text", "text": "Say hello!"}]}],
# [
@@ -173,7 +175,12 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
tools,
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when OpenAI Realtime used with
# "audio" modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -18,7 +18,8 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.transcript_processor import TranscriptProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -169,20 +170,20 @@ Remember, your responses should be short. Just one or two sentences, usually. Re
# Create a standard OpenAI LLM context object using the normal messages format. The
# OpenAIRealtimeLLMService will convert this internally to messages that the
# openai WebSocket API can understand.
context = OpenAILLMContext(
context = LLMContext(
[{"role": "user", "content": "Say hello!"}],
tools,
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
context_aggregator.user(),
transcript.user(), # LLM pushes TranscriptionFrames upstream
llm, # LLM
tts, # TTS
transcript.user(), # Placed after the LLM, as LLM pushes TranscriptionFrames downstream
transport.output(), # Transport bot output
transcript.assistant(), # After the transcript output, to time with the audio output
context_aggregator.assistant(),

View File

@@ -13,14 +13,15 @@ from datetime import datetime
from dotenv import load_dotenv
from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
)
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.deepgram.stt import DeepgramSTTService
@@ -69,11 +70,11 @@ async def save_conversation(params: FunctionCallParams):
timestamp = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
filename = f"{BASE_FILENAME}{timestamp}.json"
logger.debug(
f"writing conversation to {filename}\n{json.dumps(params.context.messages, indent=4)}"
f"writing conversation to {filename}\n{json.dumps(params.context.get_messages(), indent=4)}"
)
try:
with open(filename, "w") as file:
messages = params.context.get_messages_for_persistent_storage()
messages = params.context.get_messages()
# remove the last message, which is the instruction we just gave to save the conversation
messages.pop()
json.dump(messages, file, indent=2)
@@ -90,6 +91,10 @@ async def load_conversation(params: FunctionCallParams):
with open(filename, "r") as file:
params.context.set_messages(json.load(file))
await params.llm.reset_conversation()
# NOTE: we manually create a response here rather than relying
# on the function callback to trigger one since we've reset the
# conversation so the remote service doesn't know about the
# in-progress tool call.
await params.llm._create_response()
except Exception as e:
await params.result_callback({"success": False, "error": str(e)})
@@ -97,14 +102,12 @@ async def load_conversation(params: FunctionCallParams):
asyncio.create_task(_reset())
tools = [
{
"type": "function",
"name": "get_current_weather",
"description": "Get the current weather",
"parameters": {
"type": "object",
"properties": {
tools = ToolsSchema(
standard_tools=[
FunctionSchema(
name="get_current_weather",
description="Get the current weather",
properties={
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
@@ -115,45 +118,33 @@ tools = [
"description": "The temperature unit to use. Infer this from the users location.",
},
},
"required": ["location", "format"],
},
},
{
"type": "function",
"name": "save_conversation",
"description": "Save the current conversatione. Use this function to persist the current conversation to external storage.",
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
},
{
"type": "function",
"name": "get_saved_conversation_filenames",
"description": "Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
"parameters": {
"type": "object",
"properties": {},
"required": [],
},
},
{
"type": "function",
"name": "load_conversation",
"description": "Load a conversation history. Use this function to load a conversation history into the current session.",
"parameters": {
"type": "object",
"properties": {
required=["location", "format"],
),
FunctionSchema(
name="save_conversation",
description="Save the current conversatione. Use this function to persist the current conversation to external storage.",
properties={},
required=[],
),
FunctionSchema(
name="get_saved_conversation_filenames",
description="Get a list of saved conversation histories. Returns a list of filenames. Each filename includes a date and timestamp. Each file is conversation history that can be loaded into this session.",
properties={},
required=[],
),
FunctionSchema(
name="load_conversation",
description="Load a conversation history. Use this function to load a conversation history into the current session.",
properties={
"filename": {
"type": "string",
"description": "The filename of the conversation history to load.",
}
},
"required": ["filename"],
},
},
]
required=["filename"],
),
]
)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
@@ -224,8 +215,8 @@ Remember, your responses should be short. Just one or two sentences, usually."""
llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
llm.register_function("load_conversation", load_conversation)
context = OpenAILLMContext([], tools)
context_aggregator = llm.create_context_aggregator(context)
context = LLMContext([{"role": "user", "content": "Say hello!"}], tools)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[

View File

@@ -72,7 +72,6 @@ async def save_conversation(params: FunctionCallParams):
)
try:
with open(filename, "w") as file:
# todo: extract 'system' into the first message in the list
messages = params.context.get_messages()
# remove the last message, which is the instruction we just gave to save the conversation
messages.pop()

View File

@@ -90,7 +90,6 @@ async def save_conversation(params: FunctionCallParams):
)
try:
with open(filename, "w") as file:
# todo: extract 'system' into the first message in the list
messages = params.context.get_messages()
# remove the last message (the instruction to save the context)
messages.pop()

View File

@@ -20,6 +20,8 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -75,7 +77,7 @@ async def save_conversation(params: FunctionCallParams):
filename = f"{BASE_FILENAME}{timestamp}.json"
try:
with open(filename, "w") as file:
messages = params.context.get_messages_for_persistent_storage()
messages = params.context.get_messages()
# remove the last few messages. in reverse order, they are:
# - the in progress save tool call
# - the invocation of the save tool call
@@ -223,13 +225,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm.register_function("get_saved_conversation_filenames", get_saved_conversation_filenames)
llm.register_function("load_conversation", load_conversation)
context = OpenAILLMContext(
context = LLMContext(
messages=[
{"role": "system", "content": f"{system_instruction}"},
],
tools=tools,
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[

View File

@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame, TranscriptionMessage
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.transcript_processor import TranscriptProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -72,7 +74,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# inference_on_context_initialization=False,
)
context = OpenAILLMContext(
context = LLMContext(
[
{
"role": "user",
@@ -90,7 +92,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# },
],
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
transcript = TranscriptProcessor()

View File

@@ -19,7 +19,9 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -139,10 +141,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
context = OpenAILLMContext(
context = LLMContext(
[{"role": "user", "content": "Say hello."}],
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -17,7 +17,9 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import (
create_transport,
@@ -65,7 +67,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# inference_on_context_initialization=False,
)
context = OpenAILLMContext(
context = LLMContext(
[
{
"role": "user",
@@ -73,7 +75,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
},
],
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -16,7 +16,8 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
@@ -109,8 +110,8 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
# Set up conversation context and management
# The context_aggregator will automatically collect conversation context
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[

View File

@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -90,7 +92,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
tools=tools,
)
context = OpenAILLMContext(
context = LLMContext(
[
{
"role": "user",
@@ -98,7 +100,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
}
],
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -16,7 +16,9 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
@@ -129,7 +131,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
mime_type = "text/plain"
# Create context with file reference
context = OpenAILLMContext(
context = LLMContext(
[
{
"role": "user",
@@ -152,7 +154,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
except Exception as e:
logger.error(f"Error uploading file: {e}")
# Continue with a basic context if file upload fails
context = OpenAILLMContext(
context = LLMContext(
[
{
"role": "user",
@@ -162,7 +164,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
)
# Create context aggregator
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
# Build the pipeline
pipeline = Pipeline(

View File

@@ -10,7 +10,9 @@ from pipecat.frames.frames import Frame, LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -124,8 +126,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
]
# Set up conversation context and management
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -9,21 +9,21 @@ import os
from datetime import datetime
from dotenv import load_dotenv
from google.genai.types import HttpOptions
from loguru import logger
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.google.gemini_live.llm import GeminiLiveLLMService
from pipecat.services.google.gemini_live.llm_vertex import GeminiLiveVertexLLMService
from pipecat.services.llm_service import FunctionCallParams
from pipecat.transports.base_transport import BaseTransport, TransportParams
@@ -139,10 +139,13 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm.register_function("get_current_weather", fetch_weather_from_api)
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
context = OpenAILLMContext(
[{"role": "user", "content": "Say hello."}],
context = LLMContext([{"role": "user", "content": "Say hello."}])
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[

View File

@@ -18,7 +18,9 @@ from pipecat.frames.frames import EndTaskFrame, LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
@@ -62,7 +64,7 @@ You have three tools available to you:
After you've responded to the user three times, do two things, in order:
1. Politely let them know that that's all the time you have today and say goodbye.
2. Call the end_conversation tool to gracefully end the conversation.
2. *WITHOUT WAITING FOR THE USER TO RESPOND*, call the end_conversation tool to gracefully end the conversation.
"""
@@ -152,10 +154,15 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm.register_function("get_restaurant_recommendation", fetch_restaurant_recommendation)
llm.register_function("end_conversation", end_conversation)
context = OpenAILLMContext(
context = LLMContext(
[{"role": "user", "content": "Say hello."}],
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
pipeline = Pipeline(
[

View File

@@ -9,7 +9,6 @@ import os
from dotenv import load_dotenv
from loguru import logger
from simli import SimliConfig
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
@@ -66,11 +65,12 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="a167e0f3-df7e-4d52-a9c3-f949145efdab",
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
)
simli_ai = SimliVideoService(
SimliConfig(os.getenv("SIMLI_API_KEY"), os.getenv("SIMLI_FACE_ID")),
api_key=os.getenv("SIMLI_API_KEY"),
face_id="cace3ef7-a4c4-425d-a8cf-a5358eb0c427",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o-mini")

View File

@@ -18,7 +18,8 @@ from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.aws.nova_sonic.llm import AWSNovaSonicLLMService
@@ -119,9 +120,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
llm.register_function("get_current_weather", fetch_weather_from_api)
# Set up context and context management.
# AWSNovaSonicService will adapt OpenAI LLM context objects with standard message format to
# what's expected by Nova Sonic.
context = OpenAILLMContext(
context = LLMContext(
messages=[
{"role": "system", "content": f"{system_instruction}"},
{
@@ -131,7 +130,7 @@ async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
],
tools=tools,
)
context_aggregator = llm.create_context_aggregator(context)
context_aggregator = LLMContextAggregatorPair(context)
# Build the pipeline
pipeline = Pipeline(

View File

@@ -15,7 +15,9 @@ from pipecat.frames.frames import Frame, InputImageRawFrame, LLMRunFrame, Output
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import LLMAssistantAggregatorParams
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
from pipecat.processors.frameworks.rtvi import RTVIObserver, RTVIProcessor
from pipecat.runner.types import RunnerArguments
@@ -108,8 +110,13 @@ async def run_bot(pipecat_transport):
}
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(
context,
# `expect_stripped_words=False` needed when Gemini Live used with AUDIO
# modality (the default)
assistant_params=LLMAssistantAggregatorParams(expect_stripped_words=False),
)
# RTVI events for Pipecat client UI
rtvi = RTVIProcessor()

View File

@@ -0,0 +1,142 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import sentry_sdk
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.metrics.sentry import SentryMetrics
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
# Initialize Sentry
sentry_sdk.init(
dsn=os.getenv("SENTRY_DSN"),
traces_sample_rate=1.0,
)
stt = DeepgramSTTService(
api_key=os.getenv("DEEPGRAM_API_KEY"),
metrics=SentryMetrics(),
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
metrics=SentryMetrics(),
)
llm = OpenAILLMService(
api_key=os.getenv("OPENAI_API_KEY"),
metrics=SentryMetrics(),
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt,
context_aggregator.user(), # User responses
llm, # LLM
tts, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

View File

@@ -0,0 +1,153 @@
#
# Copyright (c) 20242025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.turn.smart_turn.base_smart_turn import SmartTurnParams
from pipecat.audio.turn.smart_turn.local_smart_turn_v3 import LocalSmartTurnAnalyzerV3
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams
from pipecat.frames.frames import LLMRunFrame, ManuallySwitchServiceFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.service_switcher import ServiceSwitcher, ServiceSwitcherStrategyManual
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.runner.types import RunnerArguments
from pipecat.runner.utils import create_transport
from pipecat.services.cartesia.stt import CartesiaSTTService
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.services.deepgram.stt import DeepgramSTTService
from pipecat.services.deepgram.tts import DeepgramTTSService
from pipecat.services.google.llm import GoogleLLMService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.base_transport import BaseTransport, TransportParams
from pipecat.transports.daily.transport import DailyParams
from pipecat.transports.websocket.fastapi import FastAPIWebsocketParams
load_dotenv(override=True)
# We store functions so objects (e.g. SileroVADAnalyzer) don't get
# instantiated. The function will be called when the desired transport gets
# selected.
transport_params = {
"daily": lambda: DailyParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"twilio": lambda: FastAPIWebsocketParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
"webrtc": lambda: TransportParams(
audio_in_enabled=True,
audio_out_enabled=True,
vad_analyzer=SileroVADAnalyzer(params=VADParams(stop_secs=0.2)),
turn_analyzer=LocalSmartTurnAnalyzerV3(params=SmartTurnParams()),
),
}
async def run_bot(transport: BaseTransport, runner_args: RunnerArguments):
logger.info(f"Starting bot")
stt_cartesia = CartesiaSTTService(api_key=os.getenv("CARTESIA_API_KEY"))
stt_deepgram = DeepgramSTTService(api_key=os.getenv("DEEPGRAM_API_KEY"))
stt_switcher = ServiceSwitcher(
services=[stt_cartesia, stt_deepgram], strategy_type=ServiceSwitcherStrategyManual
)
tts_cartesia = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121",
)
tts_deepgram = DeepgramTTSService(api_key=os.getenv("DEEPGRAM_API_KEY"))
tts_switcher = ServiceSwitcher(
services=[tts_cartesia, tts_deepgram], strategy_type=ServiceSwitcherStrategyManual
)
llm_openai = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
llm_google = GoogleLLMService(api_key=os.getenv("GOOGLE_API_KEY"))
llm_switcher = ServiceSwitcher(
services=[llm_openai, llm_google], strategy_type=ServiceSwitcherStrategyManual
)
messages = [
{
"role": "system",
"content": "You are a helpful LLM in a WebRTC call. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way.",
},
]
context = LLMContext(messages)
context_aggregator = LLMContextAggregatorPair(context)
pipeline = Pipeline(
[
transport.input(), # Transport user input
stt_switcher,
context_aggregator.user(), # User responses
llm_switcher, # LLM
tts_switcher, # TTS
transport.output(), # Transport bot output
context_aggregator.assistant(), # Assistant spoken responses
]
)
task = PipelineTask(
pipeline,
params=PipelineParams(
enable_metrics=True,
enable_usage_metrics=True,
),
idle_timeout_secs=runner_args.pipeline_idle_timeout_secs,
)
@transport.event_handler("on_client_connected")
async def on_client_connected(transport, client):
logger.info(f"Client connected")
# Kick off the conversation.
messages.append({"role": "system", "content": "Please introduce yourself to the user."})
await task.queue_frames([LLMRunFrame()])
await asyncio.sleep(15)
print(f"Switching to {stt_deepgram}")
await task.queue_frames([ManuallySwitchServiceFrame(service=stt_deepgram)])
await asyncio.sleep(15)
print(f"Switching to {llm_google}")
await task.queue_frames([ManuallySwitchServiceFrame(service=llm_google)])
await asyncio.sleep(15)
print(f"Switching to {tts_deepgram}")
await task.queue_frames([ManuallySwitchServiceFrame(service=tts_deepgram)])
@transport.event_handler("on_client_disconnected")
async def on_client_disconnected(transport, client):
logger.info(f"Client disconnected")
await task.cancel()
runner = PipelineRunner(handle_sigint=runner_args.handle_sigint)
await runner.run(task)
async def bot(runner_args: RunnerArguments):
"""Main bot entry point compatible with Pipecat Cloud."""
transport = await create_transport(runner_args, transport_params)
await run_bot(transport, runner_args)
if __name__ == "__main__":
from pipecat.runner.run import main
main()

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.1 MiB

View File

@@ -73,13 +73,13 @@ Transform your local bot into a production-ready service. Pipecat Cloud handles
1. [Sign up for Pipecat Cloud](https://pipecat.daily.co/sign-up).
2. Install the Pipecat Cloud CLI:
2. Install the Pipecat CLI:
```bash
uv add pipecatcloud
uv tool install pipecat-ai-cli
```
> 💡 Tip: You can run the `pipecatcloud` CLI using the `pcc` alias.
> 💡 Tip: You can run the `pipecat` CLI using the `pc` alias.
3. Set up Docker for building your bot image:
@@ -113,12 +113,22 @@ secret_set = "quickstart-secrets"
> 💡 Tip: [Set up `image_credentials`](https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets) in your TOML file for authenticated image pulls
### Log in to Pipecat Cloud
To start using the CLI, authenticate to Pipecat Cloud:
```bash
pipecat cloud auth login
```
You'll be presented with a link that you can click to authenticate your client.
### Configure secrets
Upload your API keys to Pipecat Cloud's secure storage:
```bash
uv run pcc secrets set quickstart-secrets --file .env
pipecat cloud secrets set quickstart-secrets --file .env
```
This creates a secret set called `quickstart-secrets` (matching your TOML file) and uploads all your API keys from `.env`.
@@ -128,13 +138,13 @@ This creates a secret set called `quickstart-secrets` (matching your TOML file)
Build your Docker image and push to Docker Hub:
```bash
uv run pcc docker build-push
pipecat cloud docker build-push
```
Deploy to Pipecat Cloud:
```bash
uv run pcc deploy
pipecat cloud deploy
```
### Connect to your agent

View File

@@ -1,6 +1,11 @@
agent_name = "quickstart"
image = "your_username/quickstart:0.1"
secret_set = "quickstart-secrets"
agent_profile = "agent-1x"
# RECOMMENDED: Set an image pull secret:
# https://docs.pipecat.ai/deployment/pipecat-cloud/fundamentals/secrets#image-pull-secrets
# image_credentials = "your_image_pull_secret"
[scaling]
min_agents = 1

View File

@@ -4,13 +4,14 @@ version = "0.1.0"
description = "Quickstart example for building voice AI bots with Pipecat"
requires-python = ">=3.10"
dependencies = [
"pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]>=0.0.86",
"pipecatcloud>=0.2.4"
"pipecat-ai[webrtc,daily,silero,deepgram,openai,cartesia,local-smart-turn-v3,runner]",
"pipecat-ai-cli"
]
[dependency-groups]
dev = [
"ruff~=0.12.1",
"pyright>=1.1.404,<2",
"ruff>=0.12.11,<1",
]
[tool.ruff]

View File

@@ -34,7 +34,7 @@ dependencies = [
"pyloudnorm~=0.1.1",
"resampy~=0.4.3",
"soxr~=0.5.0",
"openai>=1.74.0,<=1.99.1",
"openai>=1.74.0,<3",
# Pinning numba to resolve package dependencies
"numba==0.61.2",
"wait_for2>=0.4.1; python_version<'3.12'",
@@ -50,12 +50,12 @@ anthropic = [ "anthropic~=0.49.0" ]
assemblyai = [ "pipecat-ai[websockets-base]" ]
asyncai = [ "pipecat-ai[websockets-base]" ]
aws = [ "aioboto3~=15.0.0", "pipecat-ai[websockets-base]" ]
aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.1.0; python_version>='3.12'" ]
aws-nova-sonic = [ "aws_sdk_bedrock_runtime~=0.1.1; python_version>='3.12'" ]
azure = [ "azure-cognitiveservices-speech~=1.42.0"]
cartesia = [ "cartesia~=2.0.3", "pipecat-ai[websockets-base]" ]
cerebras = []
deepseek = []
daily = [ "daily-python~=0.19.9" ]
daily = [ "daily-python~=0.21.0" ]
deepgram = [ "deepgram-sdk~=4.7.0" ]
elevenlabs = [ "pipecat-ai[websockets-base]" ]
fal = [ "fal-client~=0.5.9" ]
@@ -84,7 +84,7 @@ nim = []
neuphonic = [ "pipecat-ai[websockets-base]" ]
noisereduce = [ "noisereduce~=3.0.3" ]
openai = [ "pipecat-ai[websockets-base]" ]
openpipe = [ "openpipe~=4.50.0" ]
openpipe = [ "openpipe>=4.50.0,<6" ]
openrouter = []
perplexity = []
playht = [ "pipecat-ai[websockets-base]" ]
@@ -102,7 +102,7 @@ silero = [ "onnxruntime>=1.20.1,<2" ]
simli = [ "simli-ai~=0.1.10"]
soniox = [ "pipecat-ai[websockets-base]" ]
soundfile = [ "soundfile~=0.13.0" ]
speechmatics = [ "speechmatics-rt>=0.4.0" ]
speechmatics = [ "speechmatics-rt>=0.5.0" ]
strands = [ "strands-agents>=1.9.1,<2" ]
tavus=[]
together = []

View File

@@ -136,6 +136,7 @@ TESTS_14 = [
("14r-function-calling-aws.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
("14v-function-calling-openai.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
("14w-function-calling-mistral.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
("14x-function-calling-openpipe.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
# Currently not working.
# ("14c-function-calling-together.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),
# ("14l-function-calling-deepseek.py", PROMPT_WEATHER, EVAL_WEATHER, BOT_SPEAKS_FIRST),

View File

@@ -22,9 +22,12 @@ class AdapterType(Enum):
Parameters:
GEMINI: Google Gemini adapter - currently the only service supporting custom tools.
SHIM: Backward compatibility shim for creating ToolsSchemas from lists of tools in
any format, used by LLMContext.from_openai_context.
"""
GEMINI = "gemini" # that is the only service where we are able to add custom tools for now
SHIM = "shim" # for use as backward compatibility shim for creating ToolsSchemas from list of tools in any format
class ToolsSchema:

View File

@@ -110,7 +110,7 @@ class AnthropicLLMAdapter(BaseLLMAdapter[AnthropicLLMInvocationParams]):
system = NOT_GIVEN
messages = []
# first, map messages using self._from_universal_context_message(m)
# First, map messages using self._from_universal_context_message(m)
try:
messages = [self._from_universal_context_message(m) for m in universal_context_messages]
except Exception as e:

View File

@@ -6,13 +6,47 @@
"""AWS Nova Sonic LLM adapter for Pipecat."""
import copy
import json
from typing import Any, Dict, List, TypedDict
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional, TypedDict
from loguru import logger
from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
class Role(Enum):
"""Roles supported in AWS Nova Sonic conversations.
Parameters:
SYSTEM: System-level messages (not used in conversation history).
USER: Messages sent by the user.
ASSISTANT: Messages sent by the assistant.
TOOL: Messages sent by tools (not used in conversation history).
"""
SYSTEM = "SYSTEM"
USER = "USER"
ASSISTANT = "ASSISTANT"
TOOL = "TOOL"
@dataclass
class AWSNovaSonicConversationHistoryMessage:
"""A single message in AWS Nova Sonic conversation history.
Parameters:
role: The role of the message sender (USER or ASSISTANT only).
text: The text content of the message.
"""
role: Role # only USER and ASSISTANT
text: str
class AWSNovaSonicLLMInvocationParams(TypedDict):
@@ -21,7 +55,9 @@ class AWSNovaSonicLLMInvocationParams(TypedDict):
This is a placeholder until support for universal LLMContext machinery is added for AWS Nova Sonic.
"""
pass
system_instruction: Optional[str]
messages: List[AWSNovaSonicConversationHistoryMessage]
tools: List[Dict[str, Any]]
class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
@@ -34,7 +70,7 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
@property
def id_for_llm_specific_messages(self) -> str:
"""Get the identifier used in LLMSpecificMessage instances for AWS Nova Sonic."""
raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
return "aws-nova-sonic"
def get_llm_invocation_params(self, context: LLMContext) -> AWSNovaSonicLLMInvocationParams:
"""Get AWS Nova Sonic-specific LLM invocation parameters from a universal LLM context.
@@ -47,7 +83,13 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
Returns:
Dictionary of parameters for invoking AWS Nova Sonic's LLM API.
"""
raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
messages = self._from_universal_context_messages(self.get_messages(context))
return {
"system_instruction": messages.system_instruction,
"messages": messages.messages,
# NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
"tools": self.from_standard_tools(context.tools) or [],
}
def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
"""Get messages from a universal LLM context in a format ready for logging about AWS Nova Sonic.
@@ -62,7 +104,75 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
Returns:
List of messages in a format ready for logging about AWS Nova Sonic.
"""
raise NotImplementedError("Universal LLMContext is not yet supported for AWS Nova Sonic.")
return self._from_universal_context_messages(self.get_messages(context)).messages
@dataclass
class ConvertedMessages:
"""Container for Google-formatted messages converted from universal context."""
messages: List[AWSNovaSonicConversationHistoryMessage]
system_instruction: Optional[str] = None
def _from_universal_context_messages(
self, universal_context_messages: List[LLMContextMessage]
) -> ConvertedMessages:
system_instruction = None
messages = []
# Bail if there are no messages
if not universal_context_messages:
return self.ConvertedMessages()
universal_context_messages = copy.deepcopy(universal_context_messages)
# If we have a "system" message as our first message, let's pull that out into "instruction"
if universal_context_messages[0].get("role") == "system":
system = universal_context_messages.pop(0)
content = system.get("content")
if isinstance(content, str):
system_instruction = content
elif isinstance(content, list):
system_instruction = content[0].get("text")
if system_instruction:
self._system_instruction = system_instruction
# Process remaining messages to fill out conversation history.
# Nova Sonic supports "user" and "assistant" messages in history.
for universal_context_message in universal_context_messages:
message = self._from_universal_context_message(universal_context_message)
if message:
messages.append(message)
return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)
def _from_universal_context_message(self, message) -> AWSNovaSonicConversationHistoryMessage:
"""Convert standard message format to Nova Sonic format.
Args:
message: Standard message dictionary to convert.
Returns:
Nova Sonic conversation history message, or None if not convertible.
"""
role = message.get("role")
if message.get("role") == "user" or message.get("role") == "assistant":
content = message.get("content")
if isinstance(message.get("content"), list):
content = ""
for c in message.get("content"):
if c.get("type") == "text":
content += " " + c.get("text")
else:
logger.error(
f"Unhandled content type in context message: {c.get('type')} - {message}"
)
# There won't be content if this is an assistant tool call entry.
# We're ignoring those since they can't be loaded into AWS Nova Sonic conversation
# history
if content:
return AWSNovaSonicConversationHistoryMessage(role=Role[role.upper()], text=content)
# NOTE: we're ignoring messages with role "tool" since they can't be loaded into AWS Nova
# Sonic conversation history
@staticmethod
def _to_aws_nova_sonic_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -100,4 +210,18 @@ class AWSNovaSonicLLMAdapter(BaseLLMAdapter[AWSNovaSonicLLMInvocationParams]):
List of dictionaries in AWS Nova Sonic function format.
"""
functions_schema = tools_schema.standard_tools
return [self._to_aws_nova_sonic_function_format(func) for func in functions_schema]
standard_tools = [
self._to_aws_nova_sonic_function_format(func) for func in functions_schema
]
# For backward compatibility, AWS Nova Sonic can still be used with
# tools in dict format, even though it always uses `LLMContext` under
# the hood (via `LLMContext.from_openai_context()`).
# To support this behavior, we use "shimmed" custom tools here.
# (We maintain this backward compatibility because users aren't
# *knowingly* opting into the new `LLMContext`.)
shimmed_tools = []
if tools_schema.custom_tools:
shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
return standard_tools + shimmed_tools

View File

@@ -107,7 +107,7 @@ class AWSBedrockLLMAdapter(BaseLLMAdapter[AWSBedrockLLMInvocationParams]):
system = None
messages = []
# first, map messages using self._from_universal_context_message(m)
# First, map messages using self._from_universal_context_message(m)
try:
messages = [self._from_universal_context_message(m) for m in universal_context_messages]
except Exception as e:

View File

@@ -8,8 +8,8 @@
import base64
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, TypedDict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple, TypedDict
from loguru import logger
from openai import NotGiven
@@ -24,13 +24,7 @@ from pipecat.processors.aggregators.llm_context import (
)
try:
from google.genai.types import (
Blob,
Content,
FunctionCall,
FunctionResponse,
Part,
)
from google.genai.types import Blob, Content, FileData, FunctionCall, FunctionResponse, Part
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error("In order to use Google AI, you need to `pip install pipecat-ai[google]`.")
@@ -133,6 +127,28 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
messages: List[Content]
system_instruction: Optional[str] = None
@dataclass
class MessageConversionResult:
"""Result of converting a single universal context message to Google format.
Either content (a Google Content object) or a system instruction string
is guaranteed to be set.
Also returns a tool call ID to name mapping for any tool calls
discovered in the message.
"""
content: Optional[Content] = None
system_instruction: Optional[str] = None
tool_call_id_to_name_mapping: Dict[str, str] = field(default_factory=dict)
@dataclass
class MessageConversionParams:
"""Parameters for converting a single universal context message to Google format."""
already_have_system_instruction: bool
tool_call_id_to_name_mapping: Dict[str, str]
def _from_universal_context_messages(
self, universal_context_messages: List[LLMContextMessage]
) -> ConvertedMessages:
@@ -156,24 +172,26 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
"""
system_instruction = None
messages = []
tool_call_id_to_name_mapping = {}
# Process each message, preserving Google-formatted messages and converting others
for message in universal_context_messages:
if isinstance(message, LLMSpecificMessage):
# Assume that LLMSpecificMessage wraps a message in Google format
messages.append(message.message)
continue
# Convert standard format to Google format
converted = self._from_standard_message(
message, already_have_system_instruction=bool(system_instruction)
result = self._from_universal_context_message(
message,
params=self.MessageConversionParams(
already_have_system_instruction=bool(system_instruction),
tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
),
)
if isinstance(converted, Content):
# Regular (non-system) message
messages.append(converted)
else:
# System instruction
system_instruction = converted
# Each result is either a Content or a system instruction
if result.content:
messages.append(result.content)
elif result.system_instruction:
system_instruction = result.system_instruction
# Merge tool call ID to name mapping
if result.tool_call_id_to_name_mapping:
tool_call_id_to_name_mapping.update(result.tool_call_id_to_name_mapping)
# Check if we only have function-related messages (no regular text)
has_regular_messages = any(
@@ -193,9 +211,16 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
return self.ConvertedMessages(messages=messages, system_instruction=system_instruction)
def _from_universal_context_message(
self, message: LLMContextMessage, *, params: MessageConversionParams
) -> MessageConversionResult:
if isinstance(message, LLMSpecificMessage):
return self.MessageConversionResult(content=message.message)
return self._from_standard_message(message, params=params)
def _from_standard_message(
self, message: LLMStandardMessage, already_have_system_instruction: bool
) -> Content | str:
self, message: LLMStandardMessage, *, params: MessageConversionParams
) -> MessageConversionResult:
"""Convert standard universal context message to Google Content object.
Handles conversion of text, images, and function calls to Google's
@@ -205,10 +230,11 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
Args:
message: Message in standard universal context format.
already_have_system_instruction: Whether we already have a system instruction
params: Parameters for conversion.
Returns:
Content object with role and parts, or a plain string for system
messages.
MessageConversionResult containing either a Content object or a
system instruction string.
Examples:
Standard text message::
@@ -242,38 +268,49 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
Converts to Google Content with::
Content(
role="model",
role="user",
parts=[Part(function_call=FunctionCall(name="search", args={"query": "test"}))]
)
"""
role = message["role"]
content = message.get("content", [])
if role == "system":
if already_have_system_instruction:
if params.already_have_system_instruction:
role = "user" # Convert system message to user role if we already have a system instruction
else:
# System instructions are returned as plain text
system_instruction: str = None
if isinstance(content, str):
return content
system_instruction = content
elif isinstance(content, list):
# If content is a list, we assume it's a list of text parts, per the standard
return " ".join(part["text"] for part in content if part.get("type") == "text")
system_instruction = " ".join(
part["text"] for part in content if part.get("type") == "text"
)
if system_instruction:
return self.MessageConversionResult(system_instruction=system_instruction)
elif role == "assistant":
role = "model"
parts = []
tool_call_id_to_name_mapping = {}
if message.get("tool_calls"):
for tc in message["tool_calls"]:
id = tc["id"]
name = tc["function"]["name"]
tool_call_id_to_name_mapping[id] = name
parts.append(
Part(
function_call=FunctionCall(
name=tc["function"]["name"],
id=id,
name=name,
args=json.loads(tc["function"]["arguments"]),
)
)
)
elif role == "tool":
role = "model"
role = "user"
try:
response = json.loads(message["content"])
if isinstance(response, dict):
@@ -284,10 +321,18 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
# Response might not be JSON-deserializable.
# This occurs with a UserImageFrame, for example, where we get a plain "COMPLETED" string.
response_dict = {"value": message["content"]}
# Get function name from mapping using tool_call_id, or fallback
tool_call_id = message.get("tool_call_id")
function_name = "tool_call_result" # Default fallback
if tool_call_id and tool_call_id in params.tool_call_id_to_name_mapping:
function_name = params.tool_call_id_to_name_mapping[tool_call_id]
parts.append(
Part(
function_response=FunctionResponse(
name="tool_call_result", # seems to work to hard-code the same name every time
id=tool_call_id,
name=function_name,
response=response_dict,
)
)
@@ -311,5 +356,18 @@ class GeminiLLMAdapter(BaseLLMAdapter[GeminiLLMInvocationParams]):
input_audio = c["input_audio"]
audio_bytes = base64.b64decode(input_audio["data"])
parts.append(Part(inline_data=Blob(mime_type="audio/wav", data=audio_bytes)))
elif c["type"] == "file_data":
file_data = c["file_data"]
parts.append(
Part(
file_data=FileData(
mime_type=file_data.get("mime_type"),
file_uri=file_data.get("file_uri"),
)
)
)
return Content(role=role, parts=parts)
return self.MessageConversionResult(
content=Content(role=role, parts=parts),
tool_call_id_to_name_mapping=tool_call_id_to_name_mapping,
)

View File

@@ -6,12 +6,18 @@
"""OpenAI Realtime LLM adapter for Pipecat."""
from typing import Any, Dict, List, TypedDict
import copy
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, TypedDict
from loguru import logger
from pipecat.adapters.base_llm_adapter import BaseLLMAdapter
from pipecat.adapters.schemas.function_schema import FunctionSchema
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
from pipecat.processors.aggregators.llm_context import LLMContext, LLMContextMessage
from pipecat.services.openai.realtime import events
class OpenAIRealtimeLLMInvocationParams(TypedDict):
@@ -20,7 +26,9 @@ class OpenAIRealtimeLLMInvocationParams(TypedDict):
This is a placeholder until support for universal LLMContext machinery is added for OpenAI Realtime.
"""
pass
system_instruction: Optional[str]
messages: List[events.ConversationItem]
tools: List[Dict[str, Any]]
class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
@@ -33,7 +41,7 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
@property
def id_for_llm_specific_messages(self) -> str:
"""Get the identifier used in LLMSpecificMessage instances for OpenAI Realtime."""
raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
return "openai-realtime"
def get_llm_invocation_params(self, context: LLMContext) -> OpenAIRealtimeLLMInvocationParams:
"""Get OpenAI Realtime-specific LLM invocation parameters from a universal LLM context.
@@ -46,7 +54,13 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
Returns:
Dictionary of parameters for invoking OpenAI Realtime's API.
"""
raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
messages = self._from_universal_context_messages(self.get_messages(context))
return {
"system_instruction": messages.system_instruction,
"messages": messages.messages,
# NOTE: LLMContext's tools are guaranteed to be a ToolsSchema (or NOT_GIVEN)
"tools": self.from_standard_tools(context.tools) or [],
}
def get_messages_for_logging(self, context) -> List[Dict[str, Any]]:
"""Get messages from a universal LLM context in a format ready for logging about OpenAI Realtime.
@@ -61,7 +75,124 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
Returns:
List of messages in a format ready for logging about OpenAI Realtime.
"""
raise NotImplementedError("Universal LLMContext is not yet supported for OpenAI Realtime.")
# NOTE: this is the same as in OpenAIAdapter, as that's what it was
# prior to a refactor. Worth noting that for OpenAI Realtime
# specifically, not everything handled here is necessarily supported
# (or supported yet).
msgs = []
for message in self.get_messages(context):
msg = copy.deepcopy(message)
if "content" in msg:
if isinstance(msg["content"], list):
for item in msg["content"]:
if item["type"] == "image_url":
if item["image_url"]["url"].startswith("data:image/"):
item["image_url"]["url"] = "data:image/..."
if item["type"] == "input_audio":
item["input_audio"]["data"] = "..."
if "mime_type" in msg and msg["mime_type"].startswith("image/"):
msg["data"] = "..."
msgs.append(msg)
return msgs
@dataclass
class ConvertedMessages:
"""Container for OpenAI-formatted messages converted from universal context."""
messages: List[events.ConversationItem]
system_instruction: Optional[str] = None
def _from_universal_context_messages(
self, universal_context_messages: List[LLMContextMessage]
) -> ConvertedMessages:
# We can't load a long conversation history into the openai realtime api yet. (The API/model
# forgets that it can do audio, if you do a series of `conversation.item.create` calls.) So
# our general strategy until this is fixed is just to put everything into a first "user"
# message as a single input.
if not universal_context_messages:
return self.ConvertedMessages(messages=[])
messages = copy.deepcopy(universal_context_messages)
system_instruction = None
# If we have a "system" message as our first message, let's pull that out into session
# "instructions"
if messages[0].get("role") == "system":
system = messages.pop(0)
content = system.get("content")
if isinstance(content, str):
system_instruction = content
elif isinstance(content, list):
system_instruction = content[0].get("text")
if not messages:
return self.ConvertedMessages(messages=[], system_instruction=system_instruction)
# If we have just a single "user" item, we can just send it normally
if len(messages) == 1 and messages[0].get("role") == "user":
return self.ConvertedMessages(
messages=[self._from_universal_context_message(messages[0])],
system_instruction=system_instruction,
)
# Otherwise, let's pack everything into a single "user" message with a bit of
# explanation for the LLM
intro_text = """
This is a previously saved conversation. Please treat this conversation history as a
starting point for the current conversation."""
trailing_text = """
This is the end of the previously saved conversation. Please continue the conversation
from here. If the last message is a user instruction or question, act on that instruction
or answer the question. If the last message is an assistant response, simple say that you
are ready to continue the conversation."""
return self.ConvertedMessages(
messages=[
{
"role": "user",
"type": "message",
"content": [
{
"type": "input_text",
"text": "\n\n".join(
[intro_text, json.dumps(messages, indent=2), trailing_text]
),
}
],
}
],
system_instruction=system_instruction,
)
def _from_universal_context_message(
self, message: LLMContextMessage
) -> events.ConversationItem:
if message.get("role") == "user":
content = message.get("content")
if isinstance(message.get("content"), list):
content = ""
for c in message.get("content"):
if c.get("type") == "text":
content += " " + c.get("text")
else:
logger.error(
f"Unhandled content type in context message: {c.get('type')} - {message}"
)
return events.ConversationItem(
role="user",
type="message",
content=[events.ItemContent(type="input_text", text=content)],
)
if message.get("role") == "assistant" and message.get("tool_calls"):
tc = message.get("tool_calls")[0]
return events.ConversationItem(
type="function_call",
call_id=tc["id"],
name=tc["function"]["name"],
arguments=tc["function"]["arguments"],
)
logger.error(f"Unhandled message type in _from_universal_context_message: {message}")
@staticmethod
def _to_openai_realtime_function_format(function: FunctionSchema) -> Dict[str, Any]:
@@ -94,4 +225,18 @@ class OpenAIRealtimeLLMAdapter(BaseLLMAdapter):
List of function definitions in OpenAI Realtime format.
"""
functions_schema = tools_schema.standard_tools
return [self._to_openai_realtime_function_format(func) for func in functions_schema]
standard_tools = [
self._to_openai_realtime_function_format(func) for func in functions_schema
]
# For backward compatibility, OpenAI Realtime can still be used with
# tools in dict format, even though it always uses `LLMContext` under
# the hood (via `LLMContext.from_openai_context()`).
# To support this behavior, we use "shimmed" custom tools here.
# (We maintain this backward compatibility because users aren't
# *knowingly* opting into the new `LLMContext`.)
shimmed_tools = []
if tools_schema.custom_tools:
shimmed_tools = tools_schema.custom_tools.get(AdapterType.SHIM, [])
return standard_tools + shimmed_tools

View File

@@ -14,20 +14,41 @@ from pipecat.services.llm_service import LLMService
class LLMSwitcher(ServiceSwitcher[StrategyType]):
"""A pipeline that switches between different LLMs at runtime."""
"""A pipeline that switches between different LLMs at runtime.
Example::
llm_switcher = LLMSwitcher(
llms=[openai_llm, anthropic_llm],
strategy_type=ServiceSwitcherStrategyManual
)
"""
def __init__(self, llms: List[LLMService], strategy_type: Type[StrategyType]):
"""Initialize the service switcher with a list of LLMs and a switching strategy."""
"""Initialize the service switcher with a list of LLMs and a switching strategy.
Args:
llms: List of LLM services to switch between.
strategy_type: The strategy class to use for switching between LLMs.
"""
super().__init__(llms, strategy_type)
@property
def llms(self) -> List[LLMService]:
"""Get the list of LLMs managed by this switcher."""
"""Get the list of LLMs managed by this switcher.
Returns:
List of LLM services managed by this switcher.
"""
return self.services
@property
def active_llm(self) -> Optional[LLMService]:
"""Get the currently active LLM, if any."""
"""Get the currently active LLM.
Returns:
The currently active LLM service, or None if no LLM is active.
"""
return self.strategy.active_service
async def run_inference(self, context: LLMContext) -> Optional[str]:

View File

@@ -70,11 +70,15 @@ class PipelineRunner(BaseObject):
"""
logger.debug(f"Runner {self} started running {task}")
self._tasks[task.name] = task
params = PipelineTaskParams(loop=self._loop)
# PipelineTask handles asyncio.CancelledError to shutdown the pipeline
# properly and re-raises it in case there's more cleanup to do.
try:
params = PipelineTaskParams(loop=self._loop)
await task.run(params)
except asyncio.CancelledError:
await self._cancel()
pass
del self._tasks[task.name]
# Cleanup base object.

View File

@@ -21,10 +21,22 @@ from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
class ServiceSwitcherStrategy:
"""Base class for service switching strategies."""
"""Base class for service switching strategies.
Note:
Strategy classes are instantiated internally by ServiceSwitcher.
Developers should pass the strategy class (not an instance) to ServiceSwitcher.
"""
def __init__(self, services: List[FrameProcessor]):
"""Initialize the service switcher strategy with a list of services."""
"""Initialize the service switcher strategy with a list of services.
Note:
This is called internally by ServiceSwitcher. Do not instantiate directly.
Args:
services: List of frame processors to switch between.
"""
self.services = services
self.active_service: Optional[FrameProcessor] = None
@@ -46,10 +58,24 @@ class ServiceSwitcherStrategyManual(ServiceSwitcherStrategy):
This strategy allows the user to manually select which service is active.
The initial active service is the first one in the list.
Example::
stt_switcher = ServiceSwitcher(
services=[stt_1, stt_2],
strategy_type=ServiceSwitcherStrategyManual
)
"""
def __init__(self, services: List[FrameProcessor]):
"""Initialize the manual service switcher strategy with a list of services."""
"""Initialize the manual service switcher strategy with a list of services.
Note:
This is called internally by ServiceSwitcher. Do not instantiate directly.
Args:
services: List of frame processors to switch between.
"""
super().__init__(services)
self.active_service = services[0] if services else None
@@ -85,7 +111,12 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
"""A pipeline that switches between different services at runtime."""
def __init__(self, services: List[FrameProcessor], strategy_type: Type[StrategyType]):
"""Initialize the service switcher with a list of services and a switching strategy."""
"""Initialize the service switcher with a list of services and a switching strategy.
Args:
services: List of frame processors to switch between.
strategy_type: The strategy class to use for switching between services.
"""
strategy = strategy_type(services)
super().__init__(*self._make_pipeline_definitions(services, strategy))
self.services = services
@@ -100,14 +131,20 @@ class ServiceSwitcher(ParallelPipeline, Generic[StrategyType]):
active_service: FrameProcessor,
direction: FrameDirection,
):
"""Initialize the service switcher filter with a strategy and direction."""
"""Initialize the service switcher filter with a strategy and direction.
Args:
wrapped_service: The service that this filter wraps.
active_service: The currently active service.
direction: The direction of frame flow to filter.
"""
self._wrapped_service = wrapped_service
self._active_service = active_service
async def filter(_: Frame) -> bool:
return self._wrapped_service == self._active_service
super().__init__(filter, direction)
self._wrapped_service = wrapped_service
self._active_service = active_service
super().__init__(filter, direction, filter_system_frames=True)
async def process_frame(self, frame, direction):
"""Process a frame through the filter, handling special internal filter-updating frames."""

View File

@@ -269,6 +269,9 @@ class PipelineTask(BasePipelineTask):
# StopFrame) has been received at the end of the pipeline.
self._pipeline_end_event = asyncio.Event()
# This event is set when the pipeline truly finishes.
self._pipeline_finished_event = asyncio.Event()
# This is the final pipeline. It is composed of a source processor,
# followed by the user pipeline, and ending with a sink processor. The
# source allows us to receive and react to upstream frames, and the sink
@@ -401,11 +404,7 @@ class PipelineTask(BasePipelineTask):
await self.queue_frame(EndFrame())
async def cancel(self):
"""Immediately stop the running pipeline.
Cancels all running tasks and stops frame processing without
waiting for completion.
"""
"""Request the running pipeline to cancel."""
if not self._finished:
await self._cancel()
@@ -417,51 +416,38 @@ class PipelineTask(BasePipelineTask):
"""
if self.has_finished():
return
cleanup_pipeline = True
# Setup processors.
await self._setup(params)
# Create all main tasks and wait for the main push task. This is the
# task that pushes frames to the very beginning of our pipeline (i.e. to
# our controlled source processor).
await self._create_tasks()
try:
# Setup processors.
await self._setup(params)
# Create all main tasks and wait of the main push task. This is the
# task that pushes frames to the very beginning of our pipeline (our
# controlled source processor).
push_task = await self._create_tasks()
await push_task
# We have already cleaned up the pipeline inside the task.
cleanup_pipeline = False
# Pipeline has finished nicely.
self._finished = True
# Wait for pipeline to finish.
await self._wait_for_pipeline_finished()
except asyncio.CancelledError:
# Raise exception back to the pipeline runner so it can cancel this
# task properly.
logger.debug(f"Pipeline task {self} got cancelled from outside...")
# We have been cancelled from outside, let's just cancel everything.
await self._cancel()
# Wait again for pipeline to finish. This time we have really
# cancelled, so it should really finish.
await self._wait_for_pipeline_finished()
# Re-raise in case there's more cleanup to do.
raise
finally:
# We can reach this point for different reasons:
#
# 1. The task has finished properly (e.g. `EndFrame`).
# 2. By calling `PipelineTask.cancel()`.
# 3. By asyncio task cancellation.
#
# Case (1) will execute the code below without issues because
# `self._finished` is true.
#
# Case (2) will execute the code below without issues because
# `self._cancelled` is true.
#
# Case (3) will raise the exception above (because we are cancelling
# the asyncio task). This will be then captured by the
# `PipelineRunner` which will call `PipelineTask.cancel()` and
# therefore becoming case (2).
if self._finished or self._cancelled:
logger.debug(f"Pipeline task {self} is finishing cleanup...")
await self._cancel_tasks()
await self._cleanup(cleanup_pipeline)
if self._check_dangling_tasks:
self._print_dangling_tasks()
self._finished = True
logger.debug(f"Pipeline task {self} has finished")
# 1. The pipeline task has finished (try case).
# 2. By an asyncio task cancellation (except case).
logger.debug(f"Pipeline task {self} is finishing...")
await self._cancel_tasks()
if self._check_dangling_tasks:
self._print_dangling_tasks()
self._finished = True
logger.debug(f"Pipeline task {self} has finished")
async def queue_frame(self, frame: Frame):
"""Queue a single frame to be pushed down the pipeline.
@@ -489,19 +475,7 @@ class PipelineTask(BasePipelineTask):
if not self._cancelled:
logger.debug(f"Cancelling pipeline task {self}")
self._cancelled = True
cancel_frame = CancelFrame()
# Make sure everything is cleaned up downstream. This is sent
# out-of-band from the main streaming task which is what we want since
# we want to cancel right away.
await self._pipeline.queue_frame(cancel_frame)
# Wait for CancelFrame to make it through the pipeline.
await self._wait_for_pipeline_end(cancel_frame)
# Only cancel the push task, we don't want to be able to process any
# other frame after cancel. Everything else will be cancelled in
# run().
if self._process_push_task:
await self._task_manager.cancel_task(self._process_push_task)
self._process_push_task = None
await self.queue_frame(CancelFrame())
async def _create_tasks(self):
"""Create and start all pipeline processing tasks."""
@@ -603,6 +577,17 @@ class PipelineTask(BasePipelineTask):
self._pipeline_end_event.clear()
# We are really done.
self._pipeline_finished_event.set()
async def _wait_for_pipeline_finished(self):
await self._pipeline_finished_event.wait()
self._pipeline_finished_event.clear()
# Make sure we wait for the main task to complete.
if self._process_push_task:
await self._process_push_task
self._process_push_task = None
async def _setup(self, params: PipelineTaskParams):
"""Set up the pipeline task and all processors."""
mgr_params = TaskManagerParams(loop=params.loop)

View File

@@ -189,7 +189,7 @@ class TaskObserver(BaseObserver):
if isinstance(data, FramePushed):
if on_push_frame_deprecated:
await observer.on_push_frame(
data.src, data.dst, data.frame, data.direction, data.timestamp
data.source, data.destination, data.frame, data.direction, data.timestamp
)
else:
await observer.on_push_frame(data)

View File

@@ -17,7 +17,7 @@ service-specific adapter.
import base64
import io
from dataclasses import dataclass
from typing import Any, List, Optional, TypeAlias, Union
from typing import TYPE_CHECKING, Any, List, Optional, TypeAlias, Union
from loguru import logger
from openai._types import NOT_GIVEN as OPEN_AI_NOT_GIVEN
@@ -28,9 +28,12 @@ from openai.types.chat import (
)
from PIL import Image
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.adapters.schemas.tools_schema import AdapterType, ToolsSchema
from pipecat.frames.frames import AudioRawFrame
if TYPE_CHECKING:
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
# "Re-export" types from OpenAI that we're using as universal context types.
# NOTE: if universal message types need to someday diverge from OpenAI's, we
# should consider managing our own definitions. But we should do so carefully,
@@ -65,6 +68,34 @@ class LLMContext:
and content formatting.
"""
@staticmethod
def from_openai_context(openai_context: "OpenAILLMContext") -> "LLMContext":
"""Create a universal LLM context from an OpenAI-specific context.
NOTE: this should only be used internally, for facilitating migration
from OpenAILLMContext to LLMContext. New user code should use
LLMContext directly.
Args:
openai_context: The OpenAI LLM context to convert.
Returns:
New LLMContext instance with converted messages and settings.
"""
# Convert tools to ToolsSchema if needed.
# If the tools are already a ToolsSchema, this is a no-op.
# Otherwise, we wrap them in a shim ToolsSchema.
converted_tools = openai_context.tools
if isinstance(converted_tools, list):
converted_tools = ToolsSchema(
standard_tools=[], custom_tools={AdapterType.SHIM: converted_tools}
)
return LLMContext(
messages=openai_context.get_messages(),
tools=converted_tools,
tool_choice=openai_context.tool_choice,
)
def __init__(
self,
messages: Optional[List[LLMContextMessage]] = None,
@@ -82,6 +113,46 @@ class LLMContext:
self._tools: ToolsSchema | NotGiven = LLMContext._normalize_and_validate_tools(tools)
self._tool_choice: LLMContextToolChoice | NotGiven = tool_choice
@property
def messages(self) -> List[LLMContextMessage]:
"""Get the current messages list.
NOTE: This is equivalent to calling `get_messages()` with no filter. If
you want to filter out LLM-specific messages that don't pertain to your
LLM, use `get_messages()` directly.
Returns:
List of conversation messages.
"""
return self.get_messages()
def get_messages_for_persistent_storage(self) -> List[LLMContextMessage]:
"""Get messages suitable for persistent storage.
NOTE: the only reason this method exists is because we're "silently"
switching from OpenAILLMContext to LLMContext under the hood in some
services and don't want to trip up users who may have been relying on
this method, which is part of the public API of OpenAILLMContext but
doesn't need to be for LLMContext.
.. deprecated::
Use `get_messages()` instead.
Returns:
List of conversation messages.
"""
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"get_messages_for_persistent_storage() is deprecated, use get_messages() instead.",
DeprecationWarning,
stacklevel=2,
)
return self.get_messages()
def get_messages(self, llm_specific_filter: Optional[str] = None) -> List[LLMContextMessage]:
"""Get the current messages list.
@@ -89,7 +160,8 @@ class LLMContext:
llm_specific_filter: Optional filter to return LLM-specific
messages for the given LLM, in addition to the standard
messages. If messages end up being filtered, an error will be
logged.
logged; this is intended to catch accidental use of
incompatible LLM-specific messages.
Returns:
List of conversation messages.

View File

@@ -290,6 +290,12 @@ class LLMUserAggregator(LLMContextAggregator):
await self._handle_llm_messages_update(frame)
elif isinstance(frame, LLMSetToolsFrame):
self.set_tools(frame.tools)
# Push the LLMSetToolsFrame as well, since speech-to-speech LLM
# services (like OpenAI Realtime) may need to know about tool
# changes; unlike text-based LLM services they won't just "pick up
# the change" on the next LLM run, as the LLM is continuously
# running.
await self.push_frame(frame, direction)
elif isinstance(frame, LLMSetToolChoiceFrame):
self.set_tool_choice(frame.tool_choice)
elif isinstance(frame, SpeechControlParamsFrame):

View File

@@ -12,7 +12,7 @@ allowing for flexible frame filtering logic in processing pipelines.
from typing import Awaitable, Callable
from pipecat.frames.frames import EndFrame, Frame, SystemFrame
from pipecat.frames.frames import CancelFrame, EndFrame, Frame, StartFrame, SystemFrame
from pipecat.processors.frame_processor import FrameDirection, FrameProcessor
@@ -28,6 +28,7 @@ class FunctionFilter(FrameProcessor):
self,
filter: Callable[[Frame], Awaitable[bool]],
direction: FrameDirection = FrameDirection.DOWNSTREAM,
filter_system_frames: bool = False,
):
"""Initialize the function filter.
@@ -36,22 +37,32 @@ class FunctionFilter(FrameProcessor):
frame should pass through, False otherwise.
direction: The direction to apply filtering. Only frames moving in
this direction will be filtered. Defaults to DOWNSTREAM.
filter_system_frames: Whether to filter system frames. Defaults to False.
"""
super().__init__()
self._filter = filter
self._direction = direction
self._filter_system_frames = filter_system_frames
#
# Frame processor
#
# Ignore system frames, end frames and frames that are not following the
# direction of this gate
def _should_passthrough_frame(self, frame, direction):
"""Check if a frame should pass through without filtering."""
# Ignore system frames, end frames and frames that are not following the
# direction of this gate
return isinstance(frame, (SystemFrame, EndFrame)) or direction != self._direction
# Always passthrough frames in the wrong direction
if direction != self._direction:
return True
# Always passthrough lifecycle frames
if isinstance(frame, (StartFrame, EndFrame, CancelFrame)):
return True
# If not filtering system frames, passthrough all other system frames
if not self._filter_system_frames and isinstance(frame, SystemFrame):
return True
return False
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Process a frame through the filter.

View File

@@ -1018,6 +1018,7 @@ class RTVIObserver(BaseObserver):
if (
isinstance(frame, (UserStartedSpeakingFrame, UserStoppedSpeakingFrame))
and (direction == FrameDirection.DOWNSTREAM)
and self._params.user_speaking_enabled
):
await self._handle_interruptions(frame)

View File

@@ -76,12 +76,14 @@ class DailyRoomConfig(BaseModel):
async def configure(
aiohttp_session: aiohttp.ClientSession,
*,
api_key: Optional[str] = None,
room_exp_duration: Optional[float] = 2.0,
token_exp_duration: Optional[float] = 2.0,
sip_caller_phone: Optional[str] = None,
sip_enable_video: Optional[bool] = False,
sip_num_endpoints: Optional[int] = 1,
sip_codecs: Optional[Dict[str, List[str]]] = None,
room_properties: Optional[DailyRoomProperties] = None,
) -> DailyRoomConfig:
"""Configure Daily room URL and token with optional SIP capabilities.
@@ -91,6 +93,7 @@ async def configure(
Args:
aiohttp_session: HTTP session for making API requests.
api_key: Daily API key.
room_exp_duration: Room expiration time in hours.
token_exp_duration: Token expiration time in hours.
sip_caller_phone: Phone number or identifier for SIP display name.
@@ -99,6 +102,10 @@ async def configure(
sip_num_endpoints: Number of allowed SIP endpoints.
sip_codecs: Codecs to support for audio and video. If None, uses Daily defaults.
Example: {"audio": ["OPUS"], "video": ["H264"]}
room_properties: Optional DailyRoomProperties to use instead of building from
individual parameters. When provided, this overrides room_exp_duration and
SIP-related parameters. If not provided, properties are built from the
individual parameters as before.
Returns:
DailyRoomConfig: Object with room_url, token, and optional sip_endpoint.
@@ -115,18 +122,48 @@ async def configure(
# SIP-enabled room
sip_config = await configure(session, sip_caller_phone="+15551234567")
print(f"SIP endpoint: {sip_config.sip_endpoint}")
# Custom room properties with recording enabled
custom_props = DailyRoomProperties(
enable_recording="cloud",
max_participants=2,
)
config = await configure(session, room_properties=custom_props)
"""
# Check for required API key
api_key = os.getenv("DAILY_API_KEY")
api_key = api_key or os.getenv("DAILY_API_KEY")
if not api_key:
raise Exception(
"DAILY_API_KEY environment variable is required. "
"Get your API key from https://dashboard.daily.co/developers"
)
# Warn if both room_properties and individual parameters are provided
if room_properties is not None:
individual_params_provided = any(
[
room_exp_duration != 2.0,
token_exp_duration != 2.0,
sip_caller_phone is not None,
sip_enable_video is not False,
sip_num_endpoints != 1,
sip_codecs is not None,
]
)
if individual_params_provided:
logger.warning(
"Both room_properties and individual parameters (room_exp_duration, token_exp_duration, "
"sip_*) were provided. The room_properties will be used and individual parameters "
"will be ignored."
)
# Determine if SIP mode is enabled
sip_enabled = sip_caller_phone is not None
# If room_properties is provided, check if it has SIP configuration
if room_properties and room_properties.sip:
sip_enabled = True
daily_rest_helper = DailyRESTHelper(
daily_api_key=api_key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
@@ -150,27 +187,29 @@ async def configure(
room_name = f"{room_prefix}-{uuid.uuid4().hex[:8]}"
logger.info(f"Creating new Daily room: {room_name}")
# Calculate expiration time
expiration_time = time.time() + (room_exp_duration * 60 * 60)
# Use provided room_properties or build from parameters
if room_properties is None:
# Calculate expiration time
expiration_time = time.time() + (room_exp_duration * 60 * 60)
# Create room properties
room_properties = DailyRoomProperties(
exp=expiration_time,
eject_at_room_exp=True,
)
# Add SIP configuration if enabled
if sip_enabled:
sip_params = DailyRoomSipParams(
display_name=sip_caller_phone,
video=sip_enable_video,
sip_mode="dial-in",
num_endpoints=sip_num_endpoints,
codecs=sip_codecs,
# Create room properties
room_properties = DailyRoomProperties(
exp=expiration_time,
eject_at_room_exp=True,
)
room_properties.sip = sip_params
room_properties.enable_dialout = True # Enable outbound calls if needed
room_properties.start_video_off = not sip_enable_video # Voice-only by default
# Add SIP configuration if enabled
if sip_enabled:
sip_params = DailyRoomSipParams(
display_name=sip_caller_phone,
video=sip_enable_video,
sip_mode="dial-in",
num_endpoints=sip_num_endpoints,
codecs=sip_codecs,
)
room_properties.sip = sip_params
room_properties.enable_dialout = True # Enable outbound calls if needed
room_properties.start_video_off = not sip_enable_video # Voice-only by default
# Create room parameters
room_params = DailyRoomParams(name=room_name, properties=room_properties)

View File

@@ -70,16 +70,19 @@ import asyncio
import mimetypes
import os
import sys
import uuid
from contextlib import asynccontextmanager
from http import HTTPMethod
from pathlib import Path
from typing import Optional
from typing import Any, Dict, List, Optional, TypedDict
import aiohttp
from fastapi.responses import FileResponse
from fastapi.responses import FileResponse, Response
from loguru import logger
from pipecat.runner.types import (
DailyRunnerArguments,
RunnerArguments,
SmallWebRTCRunnerArguments,
WebSocketRunnerArguments,
)
@@ -166,6 +169,7 @@ def _create_server_app(
host: str = "localhost",
proxy: str,
esp32_mode: bool = False,
whatsapp_enabled: bool = False,
folder: Optional[str] = None,
):
"""Create FastAPI app with transport-specific routes."""
@@ -182,7 +186,8 @@ def _create_server_app(
# Set up transport-specific routes
if transport_type == "webrtc":
_setup_webrtc_routes(app, esp32_mode=esp32_mode, host=host, folder=folder)
_setup_whatsapp_routes(app)
if whatsapp_enabled:
_setup_whatsapp_routes(app)
elif transport_type == "daily":
_setup_daily_routes(app)
elif transport_type in TELEPHONY_TRANSPORTS:
@@ -200,8 +205,10 @@ def _setup_webrtc_routes(
try:
from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
from pipecat.transports.smallwebrtc.connection import SmallWebRTCConnection
from pipecat.transports.smallwebrtc.connection import IceServer, SmallWebRTCConnection
from pipecat.transports.smallwebrtc.request_handler import (
IceCandidate,
SmallWebRTCPatchRequest,
SmallWebRTCRequest,
SmallWebRTCRequestHandler,
)
@@ -209,6 +216,16 @@ def _setup_webrtc_routes(
logger.error(f"WebRTC transport dependencies not installed: {e}")
return
class IceConfig(TypedDict):
iceServers: List[IceServer]
class StartBotResult(TypedDict, total=False):
sessionId: str
iceConfig: Optional[IceConfig]
# In-memory store of active sessions: session_id -> session info
active_sessions: Dict[str, Dict[str, Any]] = {}
# Mount the frontend
app.mount("/client", SmallWebRTCPrebuiltUI)
@@ -254,6 +271,74 @@ def _setup_webrtc_routes(
)
return answer
@app.patch("/api/offer")
async def ice_candidate(request: SmallWebRTCPatchRequest):
"""Handle WebRTC new ice candidate requests."""
logger.debug(f"Received patch request: {request}")
await small_webrtc_handler.handle_patch_request(request)
return {"status": "success"}
@app.post("/start")
async def rtvi_start(request: Request):
"""Mimic Pipecat Cloud's /start endpoint."""
# Parse the request body
try:
request_data = await request.json()
logger.debug(f"Received request: {request_data}")
except Exception as e:
logger.error(f"Failed to parse request body: {e}")
request_data = {}
# Store session info immediately in memory, replicate the behavior expected on Pipecat Cloud
session_id = str(uuid.uuid4())
active_sessions[session_id] = request_data
result: StartBotResult = {"sessionId": session_id}
if request_data.get("enableDefaultIceServers"):
result["iceConfig"] = IceConfig(
iceServers=[IceServer(urls="stun:stun.l.google.com:19302")]
)
return result
@app.api_route(
"/sessions/{session_id}/{path:path}",
methods=["GET", "POST", "PUT", "PATCH", "DELETE"],
)
async def proxy_request(
session_id: str, path: str, request: Request, background_tasks: BackgroundTasks
):
"""Mimic Pipecat Cloud's proxy."""
active_session = active_sessions.get(session_id)
if active_session is None:
return Response(content="Invalid or not-yet-ready session_id", status_code=404)
if path.endswith("api/offer"):
# Parse the request body and convert to SmallWebRTCRequest
try:
request_data = await request.json()
if request.method == HTTPMethod.POST.value:
webrtc_request = SmallWebRTCRequest(
sdp=request_data["sdp"],
type=request_data["type"],
pc_id=request_data.get("pc_id"),
restart_pc=request_data.get("restart_pc"),
request_data=request_data,
)
return await offer(webrtc_request, background_tasks)
elif request.method == HTTPMethod.PATCH.value:
patch_request = SmallWebRTCPatchRequest(
pc_id=request_data["pc_id"],
candidates=[IceCandidate(**c) for c in request_data.get("candidates", [])],
)
return await ice_candidate(patch_request)
except Exception as e:
logger.error(f"Failed to parse WebRTC request: {e}")
return Response(content="Invalid WebRTC request", status_code=400)
logger.info(f"Received request for path: {path}")
return Response(status_code=200)
@asynccontextmanager
async def smallwebrtc_lifespan(app: FastAPI):
"""Manage FastAPI application lifecycle and cleanup connections."""
@@ -289,6 +374,29 @@ def _add_lifespan_to_app(app: FastAPI, new_lifespan):
def _setup_whatsapp_routes(app: FastAPI):
"""Set up WebRTC-specific routes."""
WHATSAPP_APP_SECRET = os.getenv("WHATSAPP_APP_SECRET")
WHATSAPP_PHONE_NUMBER_ID = os.getenv("WHATSAPP_PHONE_NUMBER_ID")
WHATSAPP_TOKEN = os.getenv("WHATSAPP_TOKEN")
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN = os.getenv("WHATSAPP_WEBHOOK_VERIFICATION_TOKEN")
if not all(
[
WHATSAPP_APP_SECRET,
WHATSAPP_PHONE_NUMBER_ID,
WHATSAPP_TOKEN,
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN,
]
):
logger.error(
"""Missing required environment variables for WhatsApp transport:
WHATSAPP_APP_SECRET
WHATSAPP_PHONE_NUMBER_ID
WHATSAPP_TOKEN
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN
"""
)
return
try:
from pipecat_ai_small_webrtc_prebuilt.frontend import SmallWebRTCPrebuiltUI
@@ -300,24 +408,7 @@ def _setup_whatsapp_routes(app: FastAPI):
from pipecat.transports.whatsapp.api import WhatsAppWebhookRequest
from pipecat.transports.whatsapp.client import WhatsAppClient
except ImportError as e:
logger.error(f"WebRTC transport dependencies not installed: {e}")
return
WHATSAPP_TOKEN = os.getenv("WHATSAPP_TOKEN")
WHATSAPP_PHONE_NUMBER_ID = os.getenv("WHATSAPP_PHONE_NUMBER_ID")
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN = os.getenv("WHATSAPP_WEBHOOK_VERIFICATION_TOKEN")
WHATSAPP_APP_SECRET = os.getenv("WHATSAPP_APP_SECRET")
if not all(
[
WHATSAPP_TOKEN,
WHATSAPP_PHONE_NUMBER_ID,
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN,
]
):
logger.debug(
"Missing required environment variables for WhatsApp transport. Keeping it disabled."
)
logger.error(f"WhatsApp transport dependencies not installed: {e}")
return
# Global WhatsApp client instance
@@ -439,9 +530,9 @@ def _setup_daily_routes(app: FastAPI):
"""Set up Daily-specific routes."""
@app.get("/")
async def start_agent():
async def create_room_and_start_agent():
"""Launch a Daily bot and redirect to room."""
print("Starting bot with Daily transport")
print("Starting bot with Daily transport and redirecting to Daily room")
import aiohttp
@@ -456,11 +547,11 @@ def _setup_daily_routes(app: FastAPI):
asyncio.create_task(bot_module.bot(runner_args))
return RedirectResponse(room_url)
async def _handle_rtvi_request(request: Request):
"""Common handler for both /start and /connect endpoints.
@app.post("/start")
async def start_agent(request: Request):
"""Handler for /start endpoints.
Expects POST body like::
{
"createDailyRoom": true,
"dailyRoomProperties": { "start_video_off": true },
@@ -477,47 +568,38 @@ def _setup_daily_routes(app: FastAPI):
logger.error(f"Failed to parse request body: {e}")
request_data = {}
# Extract the body data that should be passed to the bot
# This mimics Pipecat Cloud's behavior
bot_body = request_data.get("body", {})
create_daily_room = request_data.get("createDailyRoom", False)
body = request_data.get("body", {})
# Log the extracted body data for debugging
if bot_body:
logger.info(f"Extracted body data for bot: {bot_body}")
bot_module = _get_bot_module()
existing_room_url = os.getenv("DAILY_SAMPLE_ROOM_URL")
result = None
# Configure room if:
# 1. Explicitly requested via createDailyRoom in payload
# 2. Using pre-configured room from DAILY_SAMPLE_ROOM_URL env var
if create_daily_room or existing_room_url:
import aiohttp
from pipecat.runner.daily import configure
async with aiohttp.ClientSession() as session:
room_url, token = await configure(session)
runner_args = DailyRunnerArguments(room_url=room_url, token=token, body=body)
result = {
"dailyRoom": room_url,
"dailyToken": token,
"sessionId": str(uuid.uuid4()),
}
else:
logger.debug("No body data provided in request")
runner_args = RunnerArguments(body=body)
import aiohttp
# Start the bot in the background
asyncio.create_task(bot_module.bot(runner_args))
from pipecat.runner.daily import configure
async with aiohttp.ClientSession() as session:
room_url, token = await configure(session)
# Start the bot in the background with extracted body data
bot_module = _get_bot_module()
runner_args = DailyRunnerArguments(room_url=room_url, token=token, body=bot_body)
asyncio.create_task(bot_module.bot(runner_args))
# Match PCC /start endpoint response format:
return {"dailyRoom": room_url, "dailyToken": token}
@app.post("/start")
async def rtvi_start(request: Request):
"""Launch a Daily bot and return connection info for RTVI clients."""
return await _handle_rtvi_request(request)
@app.post("/connect")
async def rtvi_connect(request: Request):
"""Launch a Daily bot and return connection info for RTVI clients.
.. deprecated:: 0.0.78
Use /start instead. This endpoint will be removed in a future version.
"""
logger.warning(
"DEPRECATED: /connect endpoint is deprecated. Please use /start instead. "
"This endpoint will be removed in a future version."
)
return await _handle_rtvi_request(request)
return result
def _setup_telephony_routes(app: FastAPI, *, transport_type: str, proxy: str):
@@ -576,8 +658,6 @@ def _setup_telephony_routes(app: FastAPI, *, transport_type: str, proxy: str):
async def _run_daily_direct():
"""Run Daily bot with direct connection (no FastAPI server)."""
try:
import aiohttp
from pipecat.runner.daily import configure
except ImportError as e:
logger.error("Daily transport dependencies not installed.")
@@ -689,6 +769,12 @@ def main():
parser.add_argument(
"--verbose", "-v", action="count", default=0, help="Increase logging verbosity"
)
parser.add_argument(
"--whatsapp",
action="store_true",
default=False,
help="Ensure requried WhatsApp environment variables are present",
)
args = parser.parse_args()
@@ -708,10 +794,6 @@ def main():
logger.error("For ESP32, you need to specify `--host IP` so we can do SDP munging.")
return
if args.transport in TELEPHONY_TRANSPORTS and not args.proxy:
logger.error(f"For telephony transports, you need to specify `--proxy PROXY`.")
return
# Log level
logger.remove()
logger.add(sys.stderr, level="TRACE" if args.verbose else "DEBUG")
@@ -731,10 +813,11 @@ def main():
print()
if args.esp32:
print(f"🚀 Bot ready! (ESP32 mode)")
print(f" → Open http://{args.host}:{args.port}/client in your browser")
elif args.whatsapp:
print(f"🚀 Bot ready! (WhatsApp)")
else:
print(f"🚀 Bot ready!")
print(f" → Open http://{args.host}:{args.port}/client in your browser")
print(f" → Open http://{args.host}:{args.port}/client in your browser")
print()
elif args.transport == "daily":
print()
@@ -752,6 +835,7 @@ def main():
host=args.host,
proxy=args.proxy,
esp32_mode=args.esp32,
whatsapp_enabled=args.whatsapp,
folder=args.folder,
)

View File

@@ -20,9 +20,11 @@ from fastapi import WebSocket
class RunnerArguments:
"""Base class for runner session arguments."""
handle_sigint: bool = field(init=False)
handle_sigterm: bool = field(init=False)
pipeline_idle_timeout_secs: int = field(init=False)
# Use kw_only so subclasses don't need to worry about ordering.
handle_sigint: bool = field(init=False, kw_only=True)
handle_sigterm: bool = field(init=False, kw_only=True)
pipeline_idle_timeout_secs: int = field(init=False, kw_only=True)
body: Optional[Any] = field(default_factory=dict, kw_only=True)
def __post_init__(self):
self.handle_sigint = False
@@ -42,7 +44,6 @@ class DailyRunnerArguments(RunnerArguments):
room_url: str
token: Optional[str] = None
body: Optional[Any] = field(default_factory=dict)
@dataclass
@@ -55,7 +56,6 @@ class WebSocketRunnerArguments(RunnerArguments):
"""
websocket: WebSocket
body: Optional[Any] = field(default_factory=dict)
@dataclass

View File

@@ -108,6 +108,8 @@ class AssemblyAIConnectionParams(BaseModel):
end_of_turn_confidence_threshold: Confidence threshold for end-of-turn detection.
min_end_of_turn_silence_when_confident: Minimum silence duration when confident about end-of-turn.
max_turn_silence: Maximum silence duration before forcing end-of-turn.
keyterms_prompt: List of key terms to guide transcription. Will be JSON serialized before sending.
speech_model: Select between English and multilingual models. Defaults to "universal-streaming-english".
"""
sample_rate: int = 16000
@@ -117,3 +119,7 @@ class AssemblyAIConnectionParams(BaseModel):
end_of_turn_confidence_threshold: Optional[float] = None
min_end_of_turn_silence_when_confident: Optional[int] = None
max_turn_silence: Optional[int] = None
keyterms_prompt: Optional[List[str]] = None
speech_model: Literal["universal-streaming-english", "universal-streaming-multilingual"] = (
"universal-streaming-english"
)

View File

@@ -174,11 +174,16 @@ class AssemblyAISTTService(STTService):
def _build_ws_url(self) -> str:
"""Build WebSocket URL with query parameters using urllib.parse.urlencode."""
params = {
k: str(v).lower() if isinstance(v, bool) else v
for k, v in self._connection_params.model_dump().items()
if v is not None
}
params = {}
for k, v in self._connection_params.model_dump().items():
if v is not None:
if k == "keyterms_prompt":
params[k] = json.dumps(v)
elif isinstance(v, bool):
params[k] = str(v).lower()
else:
params[k] = v
if params:
query_string = urlencode(params)
return f"{self._api_endpoint_base_url}?{query_string}"
@@ -197,6 +202,8 @@ class AssemblyAISTTService(STTService):
)
self._connected = True
self._receive_task = self.create_task(self._receive_task_handler())
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"Failed to connect to AssemblyAI: {e}")
self._connected = False
@@ -238,6 +245,7 @@ class AssemblyAISTTService(STTService):
self._websocket = None
self._connected = False
self._receive_task = None
await self._call_event_handler("on_disconnected")
async def _receive_task_handler(self):
"""Handle incoming WebSocket messages."""

View File

@@ -235,6 +235,8 @@ class AsyncAITTSService(InterruptibleTTSService):
}
await self._get_websocket().send(json.dumps(init_msg))
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -252,6 +254,7 @@ class AsyncAITTSService(InterruptibleTTSService):
finally:
self._websocket = None
self._started = False
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
if self._websocket:

View File

@@ -720,11 +720,11 @@ class AWSBedrockLLMService(LLMService):
additional_model_request_fields: Additional model-specific parameters.
"""
max_tokens: Optional[int] = Field(default_factory=lambda: 4096, ge=1)
temperature: Optional[float] = Field(default_factory=lambda: 0.7, ge=0.0, le=1.0)
top_p: Optional[float] = Field(default_factory=lambda: 0.999, ge=0.0, le=1.0)
max_tokens: Optional[int] = Field(default=None, ge=1)
temperature: Optional[float] = Field(default=None, ge=0.0, le=1.0)
top_p: Optional[float] = Field(default=None, ge=0.0, le=1.0)
stop_sequences: Optional[List[str]] = Field(default_factory=lambda: [])
latency: Optional[str] = Field(default_factory=lambda: "standard")
latency: Optional[str] = Field(default=None)
additional_model_request_fields: Optional[Dict[str, Any]] = Field(default_factory=dict)
def __init__(
@@ -801,6 +801,24 @@ class AWSBedrockLLMService(LLMService):
"""
return True
def _build_inference_config(self) -> Dict[str, Any]:
"""Build inference config with only the parameters that are set.
This prevents conflicts with models (e.g., Claude Sonnet 4.5) that don't
allow certain parameter combinations like temperature and top_p together.
Returns:
Dictionary containing only the inference parameters that are not None.
"""
inference_config = {}
if self._settings["max_tokens"] is not None:
inference_config["maxTokens"] = self._settings["max_tokens"]
if self._settings["temperature"] is not None:
inference_config["temperature"] = self._settings["temperature"]
if self._settings["top_p"] is not None:
inference_config["topP"] = self._settings["top_p"]
return inference_config
async def run_inference(self, context: LLMContext | OpenAILLMContext) -> Optional[str]:
"""Run a one-shot, out-of-band (i.e. out-of-pipeline) inference with the given LLM context.
@@ -826,16 +844,16 @@ class AWSBedrockLLMService(LLMService):
model_id = self.model_name
# Prepare request parameters
inference_config = self._build_inference_config()
request_params = {
"modelId": model_id,
"messages": messages,
"inferenceConfig": {
"maxTokens": 8192,
"temperature": 0.7,
"topP": 0.9,
},
}
if inference_config:
request_params["inferenceConfig"] = inference_config
if system:
request_params["system"] = system
@@ -974,21 +992,20 @@ class AWSBedrockLLMService(LLMService):
tools = params_from_context["tools"]
tool_choice = params_from_context["tool_choice"]
# Set up inference config
inference_config = {
"maxTokens": self._settings["max_tokens"],
"temperature": self._settings["temperature"],
"topP": self._settings["top_p"],
}
# Set up inference config - only include parameters that are set
inference_config = self._build_inference_config()
# Prepare request parameters
request_params = {
"modelId": self.model_name,
"messages": messages,
"inferenceConfig": inference_config,
"additionalModelRequestFields": self._settings["additional_model_request_fields"],
}
# Only add inference config if it has parameters
if inference_config:
request_params["inferenceConfig"] = inference_config
# Add system message
if system:
request_params["system"] = system

View File

@@ -8,8 +8,77 @@
This module provides specialized context aggregators and message handling for AWS Nova Sonic,
including conversation history management and role-specific message processing.
.. deprecated:: 0.0.91
AWS Nova Sonic no longer uses types from this module under the hood.
It now uses `LLMContext` and `LLMContextAggregatorPair`.
Using the new patterns should allow you to not need types from this module.
BEFORE:
```
# Setup
context = OpenAILLMContext(messages, tools)
context_aggregator = llm.create_context_aggregator(context)
# Context frame type
frame: OpenAILLMContextFrame
# Context type
context: AWSNovaSonicLLMContext
# or
context: OpenAILLMContext
```
AFTER:
```
# Setup
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
# Context frame type
frame: LLMContextFrame
# Context type
context: LLMContext
```
"""
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"Types in pipecat.services.aws.nova_sonic.context (or "
"pipecat.services.aws_nova_sonic.context) are deprecated. \n"
"AWS Nova Sonic no longer uses types from this module under the hood. \n"
"It now uses `LLMContext` and `LLMContextAggregatorPair`. \n"
"Using the new patterns should allow you to not need types from this module.\n\n"
"BEFORE:\n"
"```\n"
"# Setup\n"
"context = OpenAILLMContext(messages, tools)\n"
"context_aggregator = llm.create_context_aggregator(context)\n\n"
"# Context frame type\n"
"frame: OpenAILLMContextFrame\n\n"
"# Context type\n"
"context: AWSNovaSonicLLMContext\n"
"# or\n"
"context: OpenAILLMContext\n\n"
"```\n\n"
"AFTER:\n"
"```\n"
"# Setup\n"
"context = LLMContext(messages, tools)\n"
"context_aggregator = LLMContextAggregatorPair(context)\n\n"
"# Context frame type\n"
"frame: LLMContextFrame\n\n"
"# Context type\n"
"context: LLMContext\n\n"
"```",
DeprecationWarning,
stacklevel=2,
)
import copy
from dataclasses import dataclass, field
from enum import Enum

View File

@@ -25,7 +25,7 @@ from loguru import logger
from pydantic import BaseModel, Field
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.adapters.services.aws_nova_sonic_adapter import AWSNovaSonicLLMAdapter
from pipecat.adapters.services.aws_nova_sonic_adapter import AWSNovaSonicLLMAdapter, Role
from pipecat.frames.frames import (
BotStoppedSpeakingFrame,
CancelFrame,
@@ -33,35 +33,30 @@ from pipecat.frames.frames import (
Frame,
FunctionCallFromLLM,
InputAudioRawFrame,
InterimTranscriptionFrame,
InterruptionFrame,
LLMContextFrame,
LLMFullResponseEndFrame,
LLMFullResponseStartFrame,
LLMTextFrame,
StartFrame,
TranscriptionFrame,
TTSAudioRawFrame,
TTSStartedFrame,
TTSStoppedFrame,
TTSTextFrame,
UserStartedSpeakingFrame,
UserStoppedSpeakingFrame,
)
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import (
LLMAssistantAggregatorParams,
LLMUserAggregatorParams,
)
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
OpenAILLMContextFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.aws.nova_sonic.context import (
AWSNovaSonicAssistantContextAggregator,
AWSNovaSonicContextAggregatorPair,
AWSNovaSonicLLMContext,
AWSNovaSonicUserContextAggregator,
Role,
)
from pipecat.services.aws.nova_sonic.frames import AWSNovaSonicFunctionCallResultFrame
from pipecat.services.llm_service import LLMService
from pipecat.utils.time import time_now_iso8601
@@ -217,6 +212,11 @@ class AWSNovaSonicLLMService(LLMService):
system_instruction: System-level instruction for the model.
tools: Available tools/functions for the model to use.
send_transcription_frames: Whether to emit transcription frames.
.. deprecated:: 0.0.91
This parameter is deprecated and will be removed in a future version.
Transcription frames are always sent.
**kwargs: Additional arguments passed to the parent LLMService.
"""
super().__init__(**kwargs)
@@ -230,8 +230,20 @@ class AWSNovaSonicLLMService(LLMService):
self._params = params or Params()
self._system_instruction = system_instruction
self._tools = tools
self._send_transcription_frames = send_transcription_frames
self._context: Optional[AWSNovaSonicLLMContext] = None
if not send_transcription_frames:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"`send_transcription_frames` is deprecated and will be removed in a future version. "
"Transcription frames are always sent.",
DeprecationWarning,
stacklevel=2,
)
self._context: Optional[LLMContext] = None
self._stream: Optional[
DuplexEventStream[
InvokeModelWithBidirectionalStreamInput,
@@ -244,12 +256,17 @@ class AWSNovaSonicLLMService(LLMService):
self._input_audio_content_name: Optional[str] = None
self._content_being_received: Optional[CurrentContent] = None
self._assistant_is_responding = False
self._may_need_repush_assistant_text = False
self._ready_to_send_context = False
self._handling_bot_stopped_speaking = False
self._triggering_assistant_response = False
self._waiting_for_trigger_transcription = False
self._disconnecting = False
self._connected_time: Optional[float] = None
self._wants_connection = False
self._user_text_buffer = ""
self._assistant_text_buffer = ""
self._completed_tool_calls = set()
file_path = files("pipecat.services.aws.nova_sonic").joinpath("ready.wav")
with wave.open(file_path.open("rb"), "rb") as wav_file:
@@ -302,12 +319,12 @@ class AWSNovaSonicLLMService(LLMService):
logger.debug("Resetting conversation")
await self._handle_bot_stopped_speaking(delay_to_catch_trailing_assistant_text=False)
# Carry over previous context through disconnect
# Grab context to carry through disconnect/reconnect
context = self._context
await self._disconnect()
self._context = context
await self._disconnect()
await self._start_connecting()
await self._handle_context(context)
#
# frame processing
@@ -322,28 +339,35 @@ class AWSNovaSonicLLMService(LLMService):
"""
await super().process_frame(frame, direction)
if isinstance(frame, OpenAILLMContextFrame):
await self._handle_context(frame.context)
elif isinstance(frame, LLMContextFrame):
raise NotImplementedError(
"Universal LLMContext is not yet supported for AWS Nova Sonic."
if isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
context = (
frame.context
if isinstance(frame, LLMContextFrame)
else LLMContext.from_openai_context(frame.context)
)
await self._handle_context(context)
elif isinstance(frame, InputAudioRawFrame):
await self._handle_input_audio_frame(frame)
elif isinstance(frame, BotStoppedSpeakingFrame):
await self._handle_bot_stopped_speaking(delay_to_catch_trailing_assistant_text=True)
elif isinstance(frame, AWSNovaSonicFunctionCallResultFrame):
await self._handle_function_call_result(frame)
elif isinstance(frame, InterruptionFrame):
await self._handle_interruption_frame()
await self.push_frame(frame, direction)
async def _handle_context(self, context: OpenAILLMContext):
async def _handle_context(self, context: LLMContext):
if self._disconnecting:
return
if not self._context:
# We got our initial context - try to finish connecting
self._context = AWSNovaSonicLLMContext.upgrade_to_nova_sonic(
context, self._system_instruction
)
# We got our initial context
# Try to finish connecting
self._context = context
await self._finish_connecting_if_context_available()
else:
# We got an updated context
# Send results for any newly-completed function calls
await self._process_completed_function_calls(send_new_results=True)
async def _handle_input_audio_frame(self, frame: InputAudioRawFrame):
# Wait until we're done sending the assistant response trigger audio before sending audio
@@ -393,9 +417,9 @@ class AWSNovaSonicLLMService(LLMService):
else:
await finalize_assistant_response()
async def _handle_function_call_result(self, frame: AWSNovaSonicFunctionCallResultFrame):
result = frame.result_frame
await self._send_tool_result(tool_call_id=result.tool_call_id, result=result.result)
async def _handle_interruption_frame(self):
if self._assistant_is_responding:
self._may_need_repush_assistant_text = True
#
# LLM communication: lifecycle
@@ -431,6 +455,17 @@ class AWSNovaSonicLLMService(LLMService):
logger.error(f"{self} initialization error: {e}")
await self._disconnect()
async def _process_completed_function_calls(self, send_new_results: bool):
# Check for set of completed function calls in the context
for message in self._context.get_messages():
if message.get("role") and message.get("content") != "IN_PROGRESS":
tool_call_id = message.get("tool_call_id")
if tool_call_id and tool_call_id not in self._completed_tool_calls:
# Found a newly-completed function call - send the result to the service
if send_new_results:
await self._send_tool_result(tool_call_id, message.get("content"))
self._completed_tool_calls.add(tool_call_id)
async def _finish_connecting_if_context_available(self):
# We can only finish connecting once we've gotten our initial context and we're ready to
# send it
@@ -439,30 +474,38 @@ class AWSNovaSonicLLMService(LLMService):
logger.info("Finishing connecting (setting up session)...")
# Initialize our bookkeeping of already-completed tool calls in the
# context
await self._process_completed_function_calls(send_new_results=False)
# Read context
history = self._context.get_messages_for_initializing_history()
adapter: AWSNovaSonicLLMAdapter = self.get_llm_adapter()
llm_connection_params = adapter.get_llm_invocation_params(self._context)
# Send prompt start event, specifying tools.
# Tools from context take priority over self._tools.
tools = (
self._context.tools
if self._context.tools
else self.get_llm_adapter().from_standard_tools(self._tools)
llm_connection_params["tools"]
if llm_connection_params["tools"]
else adapter.from_standard_tools(self._tools)
)
logger.debug(f"Using tools: {tools}")
await self._send_prompt_start_event(tools)
# Send system instruction.
# Instruction from context takes priority over self._system_instruction.
# (NOTE: this prioritizing occurred automatically behind the scenes: the context was
# initialized with self._system_instruction and then updated itself from its messages when
# get_messages_for_initializing_history() was called).
logger.debug(f"Using system instruction: {history.system_instruction}")
if history.system_instruction:
await self._send_text_event(text=history.system_instruction, role=Role.SYSTEM)
system_instruction = (
llm_connection_params["system_instruction"]
if llm_connection_params["system_instruction"]
else self._system_instruction
)
logger.debug(f"Using system instruction: {system_instruction}")
if system_instruction:
await self._send_text_event(text=system_instruction, role=Role.SYSTEM)
# Send conversation history
for message in history.messages:
for message in llm_connection_params["messages"]:
# logger.debug(f"Seeding conversation history with message: {message}")
await self._send_text_event(text=message.text, role=message.role)
# Start audio input
@@ -492,9 +535,12 @@ class AWSNovaSonicLLMService(LLMService):
await self._send_session_end_events()
self._client = None
# Clean up context
self._context = None
# Clean up stream
if self._stream:
await self._stream.input_stream.close()
await self._stream.close()
self._stream = None
# NOTE: see explanation of HACK, below
@@ -510,15 +556,23 @@ class AWSNovaSonicLLMService(LLMService):
self._receive_task = None
# Reset remaining connection-specific state
# Should be all private state except:
# - _wants_connection
# - _assistant_response_trigger_audio
self._prompt_name = None
self._input_audio_content_name = None
self._content_being_received = None
self._assistant_is_responding = False
self._may_need_repush_assistant_text = False
self._ready_to_send_context = False
self._handling_bot_stopped_speaking = False
self._triggering_assistant_response = False
self._waiting_for_trigger_transcription = False
self._disconnecting = False
self._connected_time = None
self._user_text_buffer = ""
self._assistant_text_buffer = ""
self._completed_tool_calls = set()
logger.info("Finished disconnecting")
except Exception as e:
@@ -826,6 +880,10 @@ class AWSNovaSonicLLMService(LLMService):
# Handle the LLM completion ending
await self._handle_completion_end_event(event_json)
except Exception as e:
if self._disconnecting:
# Errors are kind of expected while disconnecting, so just
# ignore them and do nothing
return
logger.error(f"{self} error processing responses: {e}")
if self._wants_connection:
await self.reset_conversation()
@@ -956,7 +1014,7 @@ class AWSNovaSonicLLMService(LLMService):
async def _report_assistant_response_started(self):
logger.debug("Assistant response started")
# Report that the assistant has started their response.
# Report the start of the assistant response.
await self.push_frame(LLMFullResponseStartFrame())
# Report that equivalent of TTS (this is a speech-to-speech model) started
@@ -968,23 +1026,16 @@ class AWSNovaSonicLLMService(LLMService):
logger.debug(f"Assistant response text added: {text}")
# Report some text added to the ongoing assistant response
await self.push_frame(LLMTextFrame(text))
# Report some text added to the *equivalent* of TTS (this is a speech-to-speech model)
# Report the text of the assistant response.
await self.push_frame(TTSTextFrame(text))
# TODO: this is a (hopefully temporary) HACK. Here we directly manipulate the context rather
# than relying on the frames pushed to the assistant context aggregator. The pattern of
# receiving full-sentence text after the assistant has spoken does not easily fit with the
# Pipecat expectation of chunks of text streaming in while the assistant is speaking.
# Interruption handling was especially challenging. Rather than spend days trying to fit a
# square peg in a round hole, I decided on this hack for the time being. We can most cleanly
# abandon this hack if/when AWS Nova Sonic implements streaming smaller text chunks
# interspersed with audio. Note that when we move away from this hack, we need to make sure
# that on an interruption we avoid sending LLMFullResponseEndFrame, which gets the
# LLMAssistantContextAggregator into a bad state.
self._context.buffer_assistant_text(text)
# HACK: here we're also buffering the assistant text ourselves as a
# backup rather than relying solely on the assistant context aggregator
# to do it, because the text arrives from Nova Sonic only after all the
# assistant audio frames have been pushed, meaning that if an
# interruption frame were to arrive we would lose all of it (the text
# frames sitting in the queue would be wiped).
self._assistant_text_buffer += text
async def _report_assistant_response_ended(self):
if not self._context: # should never happen
@@ -992,14 +1043,34 @@ class AWSNovaSonicLLMService(LLMService):
logger.debug("Assistant response ended")
# Report that the assistant has finished their response.
# If an interruption frame arrived while the assistant was responding
# we may have lost all of the assistant text (see HACK, above), so
# re-push it downstream to the aggregator now.
if self._may_need_repush_assistant_text:
# Just in case, check that assistant text hasn't already made it
# into the context (sometimes it does, despite the interruption).
messages = self._context.get_messages()
last_message = messages[-1] if messages else None
if (
not last_message
or last_message.get("role") != "assistant"
or last_message.get("content") != self._assistant_text_buffer
):
# We also need to re-push the LLMFullResponseStartFrame since the
# TTSTextFrame would be ignored otherwise (the interruption frame
# would have cleared the assistant aggregator state).
await self.push_frame(LLMFullResponseStartFrame())
await self.push_frame(TTSTextFrame(self._assistant_text_buffer))
self._may_need_repush_assistant_text = False
# Report the end of the assistant response.
await self.push_frame(LLMFullResponseEndFrame())
# Report that equivalent of TTS (this is a speech-to-speech model) stopped.
await self.push_frame(TTSStoppedFrame())
# For an explanation of this hack, see _report_assistant_response_text_added.
self._context.flush_aggregated_assistant_text()
# Clear out the buffered assistant text
self._assistant_text_buffer = ""
#
# user transcription reporting
@@ -1016,33 +1087,67 @@ class AWSNovaSonicLLMService(LLMService):
logger.debug(f"User transcription text added: {text}")
# Manually add new user transcription text to context.
# We can't rely on the user context aggregator to do this since it's upstream from the LLM.
self._context.buffer_user_text(text)
# Report that some new user transcription text is available.
if self._send_transcription_frames:
await self.push_frame(
InterimTranscriptionFrame(text=text, user_id="", timestamp=time_now_iso8601())
)
# HACK: here we're buffering the user text ourselves rather than
# relying on the upstream user context aggregator to do it, because the
# text arrives in fairly large chunks spaced fairly far apart in time.
# That means the user text would be split between different messages in
# context. Even if we sent placeholder InterimTranscriptionFrames in
# between each TranscriptionFrame to tell the aggregator to hold off on
# finalizing the user message, the aggregator would likely get the last
# chunk too late.
self._user_text_buffer += f" {text}" if self._user_text_buffer else text
async def _report_user_transcription_ended(self):
if not self._context: # should never happen
return
# Manually add user transcription to context (if any has been buffered).
# We can't rely on the user context aggregator to do this since it's upstream from the LLM.
transcription = self._context.flush_aggregated_user_text()
if not transcription:
return
logger.debug(f"User transcription ended")
if self._send_transcription_frames:
await self.push_frame(
TranscriptionFrame(text=transcription, user_id="", timestamp=time_now_iso8601())
# Report to the upstream user context aggregator that some new user
# transcription text is available.
# HACK: Check if this transcription was triggered by our own
# assistant response trigger. If so, we need to wrap it with
# UserStarted/StoppedSpeakingFrames; otherwise the user aggregator
# would fire an EmulatedUserStartedSpeakingFrame, which would
# trigger an interruption, which would prevent us from writing the
# assistant response to context.
#
# Sending an EmulateUserStartedSpeakingFrame ourselves doesn't
# work: it just causes the interruption we're trying to avoid.
#
# Setting enable_emulated_vad_interruptions also doesn't work: at
# the time the user aggregator receives the TranscriptionFrame, it
# doesn't yet know the assistant has started responding, so it
# doesn't know that emulating the user starting to speak would
# cause an interruption.
should_wrap_in_user_started_stopped_speaking_frames = (
self._waiting_for_trigger_transcription
and self._user_text_buffer.strip().lower() == "ready"
)
# Start wrapping the upstream transcription in UserStarted/StoppedSpeakingFrames if needed
if should_wrap_in_user_started_stopped_speaking_frames:
logger.debug(
"Wrapping assistant response trigger transcription with upstream UserStarted/StoppedSpeakingFrames"
)
await self.push_frame(UserStartedSpeakingFrame(), direction=FrameDirection.UPSTREAM)
# Send the transcription upstream for the user context aggregator
frame = TranscriptionFrame(
text=self._user_text_buffer, user_id="", timestamp=time_now_iso8601()
)
await self.push_frame(frame, direction=FrameDirection.UPSTREAM)
# Finish wrapping the upstream transcription in UserStarted/StoppedSpeakingFrames if needed
if should_wrap_in_user_started_stopped_speaking_frames:
await self.push_frame(UserStoppedSpeakingFrame(), direction=FrameDirection.UPSTREAM)
# Clear out the buffered user text
self._user_text_buffer = ""
# We're no longer waiting for a trigger transcription
self._waiting_for_trigger_transcription = False
#
# context
@@ -1054,23 +1159,26 @@ class AWSNovaSonicLLMService(LLMService):
*,
user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
) -> AWSNovaSonicContextAggregatorPair:
) -> LLMContextAggregatorPair:
"""Create context aggregator pair for managing conversation context.
NOTE: this method exists only for backward compatibility. New code
should instead do:
context = LLMContext(...)
context_aggregator = LLMContextAggregatorPair(context)
Args:
context: The OpenAI LLM context to upgrade.
context: The OpenAI LLM context.
user_params: Parameters for the user context aggregator.
assistant_params: Parameters for the assistant context aggregator.
Returns:
A pair of user and assistant context aggregators.
"""
context.set_llm_adapter(self.get_llm_adapter())
user = AWSNovaSonicUserContextAggregator(context=context, params=user_params)
assistant = AWSNovaSonicAssistantContextAggregator(context=context, params=assistant_params)
return AWSNovaSonicContextAggregatorPair(user, assistant)
context = LLMContext.from_openai_context(context)
return LLMContextAggregatorPair(
context, user_params=user_params, assistant_params=assistant_params
)
#
# assistant response trigger (HACK)
@@ -1108,6 +1216,8 @@ class AWSNovaSonicLLMService(LLMService):
try:
logger.debug("Sending assistant response trigger...")
self._waiting_for_trigger_transcription = True
chunk_duration = 0.02 # what we might get from InputAudioRawFrame
chunk_size = int(
chunk_duration

View File

@@ -286,6 +286,7 @@ class AWSTranscribeSTTService(STTService):
logger.info(f"{self} Successfully connected to AWS Transcribe")
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} Failed to connect to AWS Transcribe: {e}")
await self._disconnect()
@@ -310,6 +311,7 @@ class AWSTranscribeSTTService(STTService):
logger.warning(f"{self} Error closing WebSocket connection: {e}")
finally:
self._ws_client = None
await self._call_event_handler("on_disconnected")
def language_to_service_language(self, language: Language) -> str | None:
"""Convert internal language enum to AWS Transcribe language code.

View File

@@ -8,18 +8,14 @@
This module provides specialized context aggregators and message handling for AWS Nova Sonic,
including conversation history management and role-specific message processing.
.. deprecated:: 0.0.91
AWS Nova Sonic no longer uses types from this module under the hood.
It now uses `LLMContext` and `LLMContextAggregatorPair`.
Using the new patterns should allow you to not need types from this module.
See deprecation warning in pipecat.services.aws.nova_sonic.context for more
details.
"""
import warnings
from pipecat.services.aws.nova_sonic.context import *
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"Types in pipecat.services.aws_nova_sonic.context are deprecated. "
"Please use the equivalent types from "
"pipecat.services.aws.nova_sonic.context instead.",
DeprecationWarning,
stacklevel=2,
)

View File

@@ -38,7 +38,7 @@ class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
Args:
api_key: The API key for the Azure OpenAI service.
base_url: The full Azure WebSocket endpoint URL including api-version and deployment.
Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2024-10-01-preview&deployment=my-realtime-deployment"
Example: "wss://my-project.openai.azure.com/openai/realtime?api-version=2025-04-01-preview&deployment=my-realtime-deployment"
**kwargs: Additional arguments passed to parent OpenAIRealtimeLLMService.
"""
super().__init__(base_url=base_url, api_key=api_key, **kwargs)
@@ -52,7 +52,7 @@ class AzureRealtimeLLMService(OpenAIRealtimeLLMService):
# handle disconnections in the send/recv code paths.
return
logger.info(f"Connecting to {self.base_url}, api key: {self.api_key}")
logger.info(f"Connecting to {self.base_url}")
self._websocket = await websocket_connect(
uri=self.base_url,
additional_headers={

View File

@@ -28,13 +28,12 @@ from pipecat.frames.frames import (
UserStoppedSpeakingFrame,
)
from pipecat.processors.frame_processor import FrameDirection
from pipecat.services.stt_service import STTService
from pipecat.services.stt_service import WebsocketSTTService
from pipecat.transcriptions.language import Language
from pipecat.utils.time import time_now_iso8601
from pipecat.utils.tracing.service_decorators import traced_stt
try:
import websockets
from websockets.asyncio.client import connect as websocket_connect
from websockets.protocol import State
except ModuleNotFoundError as e:
@@ -124,7 +123,7 @@ class CartesiaLiveOptions:
return cls(**json.loads(json_str))
class CartesiaSTTService(STTService):
class CartesiaSTTService(WebsocketSTTService):
"""Speech-to-text service using Cartesia Live API.
Provides real-time speech transcription through WebSocket connection
@@ -176,8 +175,7 @@ class CartesiaSTTService(STTService):
self.set_model_name(merged_options.model)
self._api_key = api_key
self._base_url = base_url or "api.cartesia.ai"
self._connection = None
self._receiver_task = None
self._receive_task = None
def can_generate_metrics(self) -> bool:
"""Check if the service can generate processing metrics.
@@ -214,6 +212,27 @@ class CartesiaSTTService(STTService):
await super().cancel(frame)
await self._disconnect()
async def start_metrics(self):
"""Start performance metrics collection for transcription processing."""
await self.start_ttfb_metrics()
await self.start_processing_metrics()
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Process incoming frames and handle speech events.
Args:
frame: The frame to process.
direction: Direction of frame flow in the pipeline.
"""
await super().process_frame(frame, direction)
if isinstance(frame, UserStartedSpeakingFrame):
await self.start_metrics()
elif isinstance(frame, UserStoppedSpeakingFrame):
# Send finalize command to flush the transcription session
if self._websocket and self._websocket.state is State.OPEN:
await self._websocket.send("finalize")
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
"""Process audio data for speech-to-text transcription.
@@ -224,45 +243,71 @@ class CartesiaSTTService(STTService):
None - transcription results are handled via WebSocket responses.
"""
# If the connection is closed, due to timeout, we need to reconnect when the user starts speaking again
if not self._connection or self._connection.state is State.CLOSED:
if not self._websocket or self._websocket.state is State.CLOSED:
await self._connect()
await self._connection.send(audio)
await self._websocket.send(audio)
yield None
async def _connect(self):
params = self._settings.to_dict()
ws_url = f"wss://{self._base_url}/stt/websocket?{urllib.parse.urlencode(params)}"
logger.debug(f"Connecting to Cartesia: {ws_url}")
headers = {"Cartesia-Version": "2025-04-16", "X-API-Key": self._api_key}
await self._connect_websocket()
if self._websocket and not self._receive_task:
self._receive_task = asyncio.create_task(self._receive_task_handler(self._report_error))
async def _disconnect(self):
if self._receive_task:
await self.cancel_task(self._receive_task)
self._receive_task = None
await self._disconnect_websocket()
async def _connect_websocket(self):
try:
self._connection = await websocket_connect(ws_url, additional_headers=headers)
# Setup the receiver task to handle the incoming messages from the Cartesia server
if self._receiver_task is None or self._receiver_task.done():
self._receiver_task = asyncio.create_task(self._receive_messages())
logger.debug(f"Connected to Cartesia")
if self._websocket and self._websocket.state is State.OPEN:
return
logger.debug("Connecting to Cartesia STT")
params = self._settings.to_dict()
ws_url = f"wss://{self._base_url}/stt/websocket?{urllib.parse.urlencode(params)}"
headers = {"Cartesia-Version": "2025-04-16", "X-API-Key": self._api_key}
self._websocket = await websocket_connect(ws_url, additional_headers=headers)
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self}: unable to connect to Cartesia: {e}")
async def _receive_messages(self):
async def _disconnect_websocket(self):
try:
while True:
if not self._connection or self._connection.state is State.CLOSED:
break
message = await self._connection.recv()
try:
data = json.loads(message)
await self._process_response(data)
except json.JSONDecodeError:
logger.warning(f"Received non-JSON message: {message}")
except asyncio.CancelledError:
pass
except websockets.exceptions.ConnectionClosed as e:
logger.debug(f"WebSocket connection closed: {e}")
if self._websocket and self._websocket.state is State.OPEN:
logger.debug("Disconnecting from Cartesia STT")
await self._websocket.close()
except Exception as e:
logger.error(f"Error in message receiver: {e}")
logger.error(f"{self} error closing websocket: {e}")
finally:
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
if self._websocket:
return self._websocket
raise Exception("Websocket not connected")
async def _process_messages(self):
async for message in self._get_websocket():
try:
data = json.loads(message)
await self._process_response(data)
except json.JSONDecodeError:
logger.warning(f"Received non-JSON message: {message}")
async def _receive_messages(self):
while True:
await self._process_messages()
# Cartesia times out after 5 minutes of innactivity (no keepalive
# mechanism is available). So, we try to reconnect.
logger.debug(f"{self} Cartesia connection was disconnected (timeout?), reconnecting")
await self._connect_websocket()
async def _process_response(self, data):
if "type" in data:
@@ -316,41 +361,3 @@ class CartesiaSTTService(STTService):
language,
)
)
async def _disconnect(self):
if self._receiver_task:
self._receiver_task.cancel()
try:
await self._receiver_task
except asyncio.CancelledError:
pass
except Exception as e:
logger.exception(f"Unexpected exception while cancelling task: {e}")
self._receiver_task = None
if self._connection and self._connection.state is State.OPEN:
logger.debug("Disconnecting from Cartesia")
await self._connection.close()
self._connection = None
async def start_metrics(self):
"""Start performance metrics collection for transcription processing."""
await self.start_ttfb_metrics()
await self.start_processing_metrics()
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Process incoming frames and handle speech events.
Args:
frame: The frame to process.
direction: Direction of frame flow in the pipeline.
"""
await super().process_frame(frame, direction)
if isinstance(frame, UserStartedSpeakingFrame):
await self.start_metrics()
elif isinstance(frame, UserStoppedSpeakingFrame):
# Send finalize command to flush the transcription session
if self._connection and self._connection.state is State.OPEN:
await self._connection.send("finalize")

View File

@@ -48,6 +48,26 @@ except ModuleNotFoundError as e:
raise Exception(f"Missing module: {e}")
class GenerationConfig(BaseModel):
"""Configuration for Cartesia Sonic-3 generation parameters.
Sonic-3 interprets these parameters as guidance to ensure natural speech.
Test against your content for best results.
Parameters:
volume: Volume multiplier for generated speech. Valid range: [0.5, 2.0]. Default is 1.0.
speed: Speed multiplier for generated speech. Valid range: [0.6, 1.5]. Default is 1.0.
emotion: Single emotion string to guide the emotional tone. Examples include neutral,
angry, excited, content, sad, scared. Over 60 emotions are supported. For best
results, use with recommended voices: Leo, Jace, Kyle, Gavin, Maya, Tessa, Dana,
and Marian.
"""
volume: Optional[float] = None
speed: Optional[float] = None
emotion: Optional[str] = None
def language_to_cartesia_language(language: Language) -> Optional[str]:
"""Convert a Language enum to Cartesia language code.
@@ -101,16 +121,20 @@ class CartesiaTTSService(AudioContextWordTTSService):
Parameters:
language: Language to use for synthesis.
speed: Voice speed control.
emotion: List of emotion controls.
speed: Voice speed control for non-Sonic-3 models (literal values).
emotion: List of emotion controls for non-Sonic-3 models.
.. deprecated:: 0.0.68
The `emotion` parameter is deprecated and will be removed in a future version.
generation_config: Generation configuration for Sonic-3 models. Includes volume,
speed (numeric), and emotion (string) parameters.
"""
language: Optional[Language] = Language.EN
speed: Optional[Literal["slow", "normal", "fast"]] = None
emotion: Optional[List[str]] = []
generation_config: Optional[GenerationConfig] = None
def __init__(
self,
@@ -119,7 +143,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
voice_id: str,
cartesia_version: str = "2025-04-16",
url: str = "wss://api.cartesia.ai/tts/websocket",
model: str = "sonic-2",
model: str = "sonic-3",
sample_rate: Optional[int] = None,
encoding: str = "pcm_s16le",
container: str = "raw",
@@ -135,7 +159,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
voice_id: ID of the voice to use for synthesis.
cartesia_version: API version string for Cartesia service.
url: WebSocket URL for Cartesia TTS API.
model: TTS model to use (e.g., "sonic-2").
model: TTS model to use (e.g., "sonic-3").
sample_rate: Audio sample rate. If None, uses default.
encoding: Audio encoding format.
container: Audio container format.
@@ -179,6 +203,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
else "en",
"speed": params.speed,
"emotion": params.emotion,
"generation_config": params.generation_config,
}
self.set_model_name(model)
self.set_voice(voice_id)
@@ -297,6 +322,11 @@ class CartesiaTTSService(AudioContextWordTTSService):
if self._settings["speed"]:
msg["speed"] = self._settings["speed"]
if self._settings["generation_config"]:
msg["generation_config"] = self._settings["generation_config"].model_dump(
exclude_none=True
)
return json.dumps(msg)
async def start(self, frame: StartFrame):
@@ -344,10 +374,11 @@ class CartesiaTTSService(AudioContextWordTTSService):
try:
if self._websocket and self._websocket.state is State.OPEN:
return
logger.debug("Connecting to Cartesia")
logger.debug("Connecting to Cartesia TTS")
self._websocket = await websocket_connect(
f"{self._url}?api_key={self._api_key}&cartesia_version={self._cartesia_version}"
)
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -365,6 +396,7 @@ class CartesiaTTSService(AudioContextWordTTSService):
finally:
self._context_id = None
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
if self._websocket:
@@ -480,23 +512,27 @@ class CartesiaHttpTTSService(TTSService):
Parameters:
language: Language to use for synthesis.
speed: Voice speed control.
emotion: List of emotion controls.
speed: Voice speed control for non-Sonic-3 models (literal values).
emotion: List of emotion controls for non-Sonic-3 models.
.. deprecated:: 0.0.68
The `emotion` parameter is deprecated and will be removed in a future version.
generation_config: Generation configuration for Sonic-3 models. Includes volume,
speed (numeric), and emotion (string) parameters.
"""
language: Optional[Language] = Language.EN
speed: Optional[Literal["slow", "normal", "fast"]] = None
emotion: Optional[List[str]] = Field(default_factory=list)
generation_config: Optional[GenerationConfig] = None
def __init__(
self,
*,
api_key: str,
voice_id: str,
model: str = "sonic-2",
model: str = "sonic-3",
base_url: str = "https://api.cartesia.ai",
cartesia_version: str = "2024-11-13",
sample_rate: Optional[int] = None,
@@ -510,7 +546,7 @@ class CartesiaHttpTTSService(TTSService):
Args:
api_key: Cartesia API key for authentication.
voice_id: ID of the voice to use for synthesis.
model: TTS model to use (e.g., "sonic-2").
model: TTS model to use (e.g., "sonic-3").
base_url: Base URL for Cartesia HTTP API.
cartesia_version: API version string for Cartesia service.
sample_rate: Audio sample rate. If None, uses default.
@@ -537,6 +573,7 @@ class CartesiaHttpTTSService(TTSService):
else "en",
"speed": params.speed,
"emotion": params.emotion,
"generation_config": params.generation_config,
}
self.set_voice(voice_id)
self.set_model_name(model)
@@ -630,6 +667,11 @@ class CartesiaHttpTTSService(TTSService):
if self._settings["speed"]:
payload["speed"] = self._settings["speed"]
if self._settings["generation_config"]:
payload["generation_config"] = self._settings["generation_config"].model_dump(
exclude_none=True
)
yield TTSStartedFrame()
session = await self._client._get_session()

View File

@@ -205,6 +205,7 @@ class DeepgramFluxSTTService(WebsocketSTTService):
additional_headers={"Authorization": f"Token {self._api_key}"},
)
logger.debug("Connected to Deepgram Flux Websocket")
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -225,6 +226,9 @@ class DeepgramFluxSTTService(WebsocketSTTService):
await self._websocket.close()
except Exception as e:
logger.error(f"{self} error closing websocket: {e}")
finally:
self._websocket = None
await self._call_event_handler("on_disconnected")
async def _send_close_stream(self) -> None:
"""Sends a CloseStream control message to the Deepgram Flux WebSocket API.

View File

@@ -168,16 +168,24 @@ def build_elevenlabs_voice_settings(
def calculate_word_times(
alignment_info: Mapping[str, Any], cumulative_time: float
) -> List[Tuple[str, float]]:
alignment_info: Mapping[str, Any],
cumulative_time: float,
partial_word: str = "",
partial_word_start_time: float = 0.0,
) -> tuple[List[Tuple[str, float]], str, float]:
"""Calculate word timestamps from character alignment information.
Args:
alignment_info: Character alignment data from ElevenLabs API.
cumulative_time: Base time offset for this chunk.
partial_word: Partial word carried over from previous chunk.
partial_word_start_time: Start time of the partial word.
Returns:
List of (word, timestamp) tuples.
Tuple of (word_times, new_partial_word, new_partial_word_start_time):
- word_times: List of (word, timestamp) tuples for complete words
- new_partial_word: Incomplete word at end of chunk (empty if chunk ends with space)
- new_partial_word_start_time: Start time of the incomplete word
"""
chars = alignment_info["chars"]
char_start_times_ms = alignment_info["charStartTimesMs"]
@@ -186,41 +194,37 @@ def calculate_word_times(
logger.error(
f"calculate_word_times: length mismatch - chars={len(chars)}, times={len(char_start_times_ms)}"
)
return []
return ([], partial_word, partial_word_start_time)
# Build words and track their start positions
words = []
word_start_indices = []
current_word = ""
word_start_index = None
word_start_times = []
current_word = partial_word # Start with any partial word from previous chunk
word_start_time = partial_word_start_time if partial_word else None
for i, char in enumerate(chars):
if char == " ":
# End of current word
if current_word: # Only add non-empty words
words.append(current_word)
word_start_indices.append(word_start_index)
word_start_times.append(word_start_time)
current_word = ""
word_start_index = None
word_start_time = None
else:
# Building a word
if word_start_index is None: # First character of new word
word_start_index = i
if word_start_time is None: # First character of new word
# Convert from milliseconds to seconds and add cumulative offset
word_start_time = cumulative_time + (char_start_times_ms[i] / 1000.0)
current_word += char
# Handle the last word if there's no trailing space
if current_word and word_start_index is not None:
words.append(current_word)
word_start_indices.append(word_start_index)
# Build result for complete words
word_times = list(zip(words, word_start_times))
# Calculate timestamps for each word
word_times = []
for word, start_idx in zip(words, word_start_indices):
# Convert from milliseconds to seconds and add cumulative offset
start_time_seconds = cumulative_time + (char_start_times_ms[start_idx] / 1000.0)
word_times.append((word, start_time_seconds))
# Return any incomplete word at the end of this chunk
new_partial_word = current_word if current_word else ""
new_partial_word_start_time = word_start_time if word_start_time is not None else 0.0
return word_times
return (word_times, new_partial_word, new_partial_word_start_time)
class ElevenLabsTTSService(AudioContextWordTTSService):
@@ -332,6 +336,9 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
# there's an interruption or TTSStoppedFrame.
self._started = False
self._cumulative_time = 0
# Track partial words that span across alignment chunks
self._partial_word = ""
self._partial_word_start_time = 0.0
# Context management for v1 multi API
self._context_id = None
@@ -521,6 +528,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
url, max_size=16 * 1024 * 1024, additional_headers={"xi-api-key": self._api_key}
)
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -543,6 +551,7 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
self._started = False
self._context_id = None
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
if self._websocket:
@@ -570,6 +579,8 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
logger.error(f"Error closing context on interruption: {e}")
self._context_id = None
self._started = False
self._partial_word = ""
self._partial_word_start_time = 0.0
async def _receive_messages(self):
"""Handle incoming WebSocket messages from ElevenLabs."""
@@ -609,7 +620,14 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
if msg.get("alignment"):
alignment = msg["alignment"]
word_times = calculate_word_times(alignment, self._cumulative_time)
word_times, self._partial_word, self._partial_word_start_time = (
calculate_word_times(
alignment,
self._cumulative_time,
self._partial_word,
self._partial_word_start_time,
)
)
if word_times:
await self.add_word_timestamps(word_times)
@@ -683,6 +701,8 @@ class ElevenLabsTTSService(AudioContextWordTTSService):
yield TTSStartedFrame()
self._started = True
self._cumulative_time = 0
self._partial_word = ""
self._partial_word_start_time = 0.0
# If a context ID does not exist, create a new one and
# register it. If an ID exists, that means the Pipeline is
# configured for allow_interruptions=False, so continue
@@ -756,6 +776,7 @@ class ElevenLabsHttpTTSService(WordTTSService):
base_url: str = "https://api.elevenlabs.io",
sample_rate: Optional[int] = None,
params: Optional[InputParams] = None,
aggregate_sentences: Optional[bool] = True,
**kwargs,
):
"""Initialize the ElevenLabs HTTP TTS service.
@@ -768,10 +789,11 @@ class ElevenLabsHttpTTSService(WordTTSService):
base_url: Base URL for ElevenLabs HTTP API.
sample_rate: Audio sample rate. If None, uses default.
params: Additional input parameters for voice customization.
aggregate_sentences: Whether to aggregate sentences within the TTSService.
**kwargs: Additional arguments passed to the parent service.
"""
super().__init__(
aggregate_sentences=True,
aggregate_sentences=aggregate_sentences,
push_text_frames=False,
push_stop_frames=True,
sample_rate=sample_rate,
@@ -809,6 +831,10 @@ class ElevenLabsHttpTTSService(WordTTSService):
# Store previous text for context within a turn
self._previous_text = ""
# Track partial words that span across alignment chunks
self._partial_word = ""
self._partial_word_start_time = 0.0
def language_to_service_language(self, language: Language) -> Optional[str]:
"""Convert pipecat Language to ElevenLabs language code.
@@ -836,6 +862,8 @@ class ElevenLabsHttpTTSService(WordTTSService):
self._cumulative_time = 0
self._started = False
self._previous_text = ""
self._partial_word = ""
self._partial_word_start_time = 0.0
logger.debug(f"{self}: Reset internal state")
async def start(self, frame: StartFrame):
@@ -870,11 +898,13 @@ class ElevenLabsHttpTTSService(WordTTSService):
def calculate_word_times(self, alignment_info: Mapping[str, Any]) -> List[Tuple[str, float]]:
"""Calculate word timing from character alignment data.
This method handles partial words that may span across multiple alignment chunks.
Args:
alignment_info: Character timing data from ElevenLabs.
Returns:
List of (word, timestamp) pairs.
List of (word, timestamp) pairs for complete words in this chunk.
Example input data::
@@ -900,30 +930,28 @@ class ElevenLabsHttpTTSService(WordTTSService):
# Build the words and find their start times
words = []
word_start_times = []
current_word = ""
first_char_idx = -1
# Start with any partial word from previous chunk
current_word = self._partial_word
word_start_time = self._partial_word_start_time if self._partial_word else None
for i, char in enumerate(chars):
if char == " ":
if current_word: # Only add non-empty words
words.append(current_word)
# Use time of the first character of the word, offset by cumulative time
word_start_times.append(
self._cumulative_time + char_start_times[first_char_idx]
)
word_start_times.append(word_start_time)
current_word = ""
first_char_idx = -1
word_start_time = None
else:
if not current_word: # This is the first character of a new word
first_char_idx = i
if word_start_time is None: # First character of a new word
# Use time of the first character of the word, offset by cumulative time
word_start_time = self._cumulative_time + char_start_times[i]
current_word += char
# Don't forget the last word if there's no trailing space
if current_word and first_char_idx >= 0:
words.append(current_word)
word_start_times.append(self._cumulative_time + char_start_times[first_char_idx])
# Store any incomplete word at the end of this chunk
self._partial_word = current_word if current_word else ""
self._partial_word_start_time = word_start_time if word_start_time is not None else 0.0
# Create word-time pairs
# Create word-time pairs for complete words only
word_times = list(zip(words, word_start_times))
return word_times
@@ -959,6 +987,9 @@ class ElevenLabsHttpTTSService(WordTTSService):
if self._voice_settings:
payload["voice_settings"] = self._voice_settings
if self._settings["apply_text_normalization"] is not None:
payload["apply_text_normalization"] = self._settings["apply_text_normalization"]
language = self._settings["language"]
if self._model_name in ELEVENLABS_MULTILINGUAL_MODELS and language:
payload["language_code"] = language
@@ -979,8 +1010,6 @@ class ElevenLabsHttpTTSService(WordTTSService):
}
if self._settings["optimize_streaming_latency"] is not None:
params["optimize_streaming_latency"] = self._settings["optimize_streaming_latency"]
if self._settings["apply_text_normalization"] is not None:
params["apply_text_normalization"] = self._settings["apply_text_normalization"]
try:
await self.start_ttfb_metrics()
@@ -1041,6 +1070,14 @@ class ElevenLabsHttpTTSService(WordTTSService):
logger.error(f"Error processing response: {e}", exc_info=True)
continue
# After processing all chunks, emit any remaining partial word
# since this is the end of the utterance
if self._partial_word:
final_word_time = [(self._partial_word, self._partial_word_start_time)]
await self.add_word_timestamps(final_word_time)
self._partial_word = ""
self._partial_word_start_time = 0.0
# After processing all chunks, add the total utterance duration
# to the cumulative time to ensure next utterance starts after this one
if utterance_duration > 0:

View File

@@ -225,6 +225,8 @@ class FishAudioTTSService(InterruptibleTTSService):
start_message = {"event": "start", "request": {"text": "", **self._settings}}
await self._websocket.send(ormsgpack.packb(start_message))
logger.debug("Sent start message to Fish Audio")
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"Fish Audio initialization error: {e}")
self._websocket = None
@@ -245,6 +247,7 @@ class FishAudioTTSService(InterruptibleTTSService):
self._request_id = None
self._started = False
self._websocket = None
await self._call_event_handler("on_disconnected")
async def flush_audio(self):
"""Flush any buffered audio by sending a flush event to Fish Audio."""

View File

@@ -17,6 +17,7 @@ import json
import random
import time
import uuid
import warnings
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional, Union
@@ -56,10 +57,12 @@ from pipecat.frames.frames import (
UserStoppedSpeakingFrame,
)
from pipecat.metrics.metrics import LLMTokenUsage
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import (
LLMAssistantAggregatorParams,
LLMUserAggregatorParams,
)
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
OpenAILLMContextFrame,
@@ -219,6 +222,10 @@ class GeminiLiveContext(OpenAILLMContext):
Provides Gemini-specific context management including system instruction
extraction and message format conversion for the Live API.
.. deprecated:: 0.0.92
Gemini Live no longer uses `GeminiLiveContext` under the hood.
It now uses `LLMContext`.
"""
@staticmethod
@@ -231,6 +238,22 @@ class GeminiLiveContext(OpenAILLMContext):
Returns:
The upgraded Gemini context instance.
"""
# This warning is here rather than `__init__` since `upgrade()` was the
# "main" way that GeminiLiveContext instances were created.
# Almost no users should be seeing this message anyway, as
# GeminiLiveContext instances were typically created under the hood:
# the user would pass an OpenAILLMContext instance, which would be
# upgraded without them necessarily knowing.
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"GeminiLiveContext is deprecated. "
"Gemini Live no longer uses GeminiLiveContext under the hood. "
"It now uses LLMContext.",
DeprecationWarning,
stacklevel=2,
)
if isinstance(obj, OpenAILLMContext) and not isinstance(obj, GeminiLiveContext):
logger.debug(f"Upgrading to Gemini Live Context: {obj}")
obj.__class__ = GeminiLiveContext
@@ -328,8 +351,28 @@ class GeminiLiveUserContextAggregator(OpenAIUserContextAggregator):
Extends OpenAI user aggregator to handle Gemini-specific message passing
while maintaining compatibility with the standard aggregation pipeline.
.. deprecated:: 0.0.92
Gemini Live no longer expects a `GeminiLiveUserContextAggregator`.
It now expects a `LLMUserAggregator`.
"""
def __init__(self, *args, **kwargs):
"""Initialize Gemini Live user context aggregator."""
# Almost no users should be seeing this message, as
# `GeminiLiveUserContextAggregator`` instances were typically created
# under the hood, as part of `llm.create_context_aggregator()`.
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"GeminiLiveUserContextAggregator is deprecated. "
"Gemini Live no longer expects a GeminiLiveUserContextAggregator. "
"It now expects a LLMUserAggregator.",
DeprecationWarning,
stacklevel=2,
)
super().__init__(*args, **kwargs)
async def process_frame(self, frame, direction):
"""Process incoming frames for user context aggregation.
@@ -349,8 +392,28 @@ class GeminiLiveAssistantContextAggregator(OpenAIAssistantContextAggregator):
Handles assistant response aggregation while filtering out LLMTextFrames
to prevent duplicate context entries, as Gemini Live pushes both
LLMTextFrames and TTSTextFrames.
.. deprecated:: 0.0.92
Gemini Live no longer uses `GeminiLiveAssistantContextAggregator` under the hood.
It now uses `LLMAssistantAggregator`.
"""
def __init__(self, *args, **kwargs):
"""Initialize Gemini Live assistant context aggregator."""
# Almost no users should be seeing this message, as
# `GeminiLiveAssistantContextAggregator` instances were typically
# created under the hood, as part of `llm.create_context_aggregator()`.
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"GeminiLiveAssistantContextAggregator is deprecated. "
"Gemini Live no longer uses GeminiLiveAssistantContextAggregator under the hood. "
"It now uses LLMAssistantAggregator.",
DeprecationWarning,
stacklevel=2,
)
super().__init__(*args, **kwargs)
async def process_frame(self, frame: Frame, direction: FrameDirection):
"""Process incoming frames for assistant context aggregation.
@@ -380,6 +443,10 @@ class GeminiLiveAssistantContextAggregator(OpenAIAssistantContextAggregator):
class GeminiLiveContextAggregatorPair:
"""Pair of user and assistant context aggregators for Gemini Live.
.. deprecated:: 0.0.92
`GeminiLiveContextAggregatorPair` is deprecated.
Use `LLMContextAggregatorPair` instead.
Parameters:
_user: The user context aggregator instance.
_assistant: The assistant context aggregator instance.
@@ -388,6 +455,19 @@ class GeminiLiveContextAggregatorPair:
_user: GeminiLiveUserContextAggregator
_assistant: GeminiLiveAssistantContextAggregator
def __post_init__(self):
# Almost no users should be seeing this message, as
# `GeminiLiveContextAggregatorPair` instances were typically created
# under the hood, with `llm.create_context_aggregator()`.
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"GeminiLiveContextAggregatorPair is deprecated. "
"Use LLMContextAggregatorPair instead.",
DeprecationWarning,
stacklevel=2,
)
def user(self) -> GeminiLiveUserContextAggregator:
"""Get the user context aggregator.
@@ -609,7 +689,7 @@ class GeminiLiveLLMService(LLMService):
self._run_llm_when_session_ready = False
self._user_is_speaking = False
self._bot_is_speaking = False
self._bot_is_responding = False
self._user_audio_buffer = bytearray()
self._user_transcription_buffer = ""
self._last_transcription_sent = ""
@@ -665,6 +745,9 @@ class GeminiLiveLLMService(LLMService):
# Initialize the API client. Subclasses can override this if needed.
self.create_client()
# Bookkeeping for tool calls
self._completed_tool_calls = set()
def create_client(self):
"""Create the Gemini API client instance. Subclasses can override this."""
self._client = Client(api_key=self._api_key, http_options=self._http_options)
@@ -787,9 +870,13 @@ class GeminiLiveLLMService(LLMService):
#
async def _handle_interruption(self):
await self._set_bot_is_speaking(False)
await self.push_frame(TTSStoppedFrame())
await self.push_frame(LLMFullResponseEndFrame())
if self._bot_is_responding:
await self._set_bot_is_responding(False)
if self._settings.get("modalities") == GeminiModalities.AUDIO:
await self.push_frame(TTSStoppedFrame())
# Do not send LLMFullResponseEndFrame here - an interruption
# already tells the assistant context aggregator that the response
# is over.
async def _handle_user_started_speaking(self, frame):
self._user_is_speaking = True
@@ -807,7 +894,6 @@ class GeminiLiveLLMService(LLMService):
#
# frame processing
#
# StartFrame, StopFrame, CancelFrame implemented in base class
#
@@ -820,7 +906,7 @@ class GeminiLiveLLMService(LLMService):
"""
# Defer EndFrame handling until after the bot turn is finished
if isinstance(frame, EndFrame):
if self._bot_is_speaking:
if self._bot_is_responding:
logger.debug("Deferring handling EndFrame until bot turn is finished")
self._end_frame_pending_bot_turn_finished = frame
return
@@ -829,22 +915,13 @@ class GeminiLiveLLMService(LLMService):
if isinstance(frame, TranscriptionFrame):
await self.push_frame(frame, direction)
elif isinstance(frame, OpenAILLMContextFrame):
context: GeminiLiveContext = GeminiLiveContext.upgrade(frame.context)
# For now, we'll only trigger inference here when either:
# 1. We have not seen a context frame before
# 2. The last message is a tool call result
if not self._context:
self._context = context
if frame.context.tools:
self._tools = frame.context.tools
await self._create_initial_response()
elif context.messages and context.messages[-1].get("role") == "tool":
# Support just one tool call per context frame for now
tool_result_message = context.messages[-1]
await self._tool_result(tool_result_message)
elif isinstance(frame, LLMContextFrame):
raise NotImplementedError("Universal LLMContext is not yet supported for Gemini Live.")
elif isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
context = (
frame.context
if isinstance(frame, LLMContextFrame)
else LLMContext.from_openai_context(frame.context)
)
await self._handle_context(context)
elif isinstance(frame, InputTextRawFrame):
await self._send_user_text(frame.text)
await self.push_frame(frame, direction)
@@ -883,13 +960,48 @@ class GeminiLiveLLMService(LLMService):
else:
await self.push_frame(frame, direction)
async def _set_bot_is_speaking(self, speaking: bool):
if self._bot_is_speaking == speaking:
async def _handle_context(self, context: LLMContext):
if not self._context:
# We got our initial context
self._context = context
if context.tools:
self._tools = context.tools
# Initialize our bookkeeping of already-completed tool calls in
# the context
await self._process_completed_function_calls(send_new_results=False)
await self._create_initial_response()
else:
# We got an updated context.
# This may contain a new user message or tool call result.
self._context = context
# Send results for newly-completed function calls, if any.
await self._process_completed_function_calls(send_new_results=True)
async def _process_completed_function_calls(self, send_new_results: bool):
# Check for set of completed function calls in the context
adapter: GeminiLLMAdapter = self.get_llm_adapter()
messages = adapter.get_llm_invocation_params(self._context).get("messages", [])
for message in messages:
if message.parts:
for part in message.parts:
if part.function_response:
tool_call_id = part.function_response.id
tool_name = part.function_response.name
if tool_call_id and tool_call_id not in self._completed_tool_calls:
# Found a newly-completed function call - send the result to the service
if send_new_results:
await self._tool_result(
tool_call_id, tool_name, part.function_response.response
)
self._completed_tool_calls.add(tool_call_id)
async def _set_bot_is_responding(self, responding: bool):
if self._bot_is_responding == responding:
return
self._bot_is_speaking = speaking
self._bot_is_responding = responding
if not self._bot_is_speaking and self._end_frame_pending_bot_turn_finished:
if not self._bot_is_responding and self._end_frame_pending_bot_turn_finished:
await self.queue_frame(self._end_frame_pending_bot_turn_finished)
self._end_frame_pending_bot_turn_finished = None
@@ -1116,6 +1228,7 @@ class GeminiLiveLLMService(LLMService):
if self._session:
await self._session.close()
self._session = None
self._completed_tool_calls = set()
self._disconnecting = False
except Exception as e:
logger.error(f"{self} error disconnecting: {e}")
@@ -1195,7 +1308,8 @@ class GeminiLiveLLMService(LLMService):
self._run_llm_when_session_ready = True
return
messages = self._context.get_messages_for_initializing_history()
adapter: GeminiLLMAdapter = self.get_llm_adapter()
messages = adapter.get_llm_invocation_params(self._context).get("messages", [])
if not messages:
return
@@ -1223,8 +1337,9 @@ class GeminiLiveLLMService(LLMService):
# Create a throwaway context just for the purpose of getting messages
# in the right format
context = GeminiLiveContext.upgrade(OpenAILLMContext(messages=messages_list))
messages = context.get_messages_for_initializing_history()
context = LLMContext(messages=messages_list)
adapter: GeminiLLMAdapter = self.get_llm_adapter()
messages = adapter.get_llm_invocation_params(context).get("messages", [])
if not messages:
return
@@ -1239,17 +1354,16 @@ class GeminiLiveLLMService(LLMService):
await self._handle_send_error(e)
@traced_gemini_live(operation="llm_tool_result")
async def _tool_result(self, tool_result_message):
async def _tool_result(
self, tool_call_id: str, tool_name: str, tool_result_message: Dict[str, Any]
):
"""Send tool result back to the API."""
if self._disconnecting or not self._session:
return
# For now we're shoving the name into the tool_call_id field, so this
# will work until we revisit that.
id = tool_result_message.get("tool_call_id")
name = tool_result_message.get("tool_call_name")
result = json.loads(tool_result_message.get("content") or "")
response = FunctionResponse(name=name, id=id, response=result)
response = FunctionResponse(name=tool_name, id=tool_call_id, response=tool_result_message)
try:
await self._session.send_tool_response(function_responses=response)
@@ -1277,7 +1391,10 @@ class GeminiLiveLLMService(LLMService):
# part.text is added when `modalities` is set to TEXT; otherwise, it's None
text = part.text
if text:
if not self._bot_text_buffer:
if not self._bot_is_responding:
# Update bot responding state and send service start frame
# (AUDIO modality case)
await self._set_bot_is_responding(True)
await self.push_frame(LLMFullResponseStartFrame())
self._bot_text_buffer += text
@@ -1288,6 +1405,8 @@ class GeminiLiveLLMService(LLMService):
if msg.server_content and msg.server_content.grounding_metadata:
self._accumulated_grounding_metadata = msg.server_content.grounding_metadata
# If we have no audio, stop here.
# All logic below this point pertains to the AUDIO modality.
inline_data = part.inline_data
if not inline_data:
return
@@ -1313,8 +1432,10 @@ class GeminiLiveLLMService(LLMService):
if not audio:
return
if not self._bot_is_speaking:
await self._set_bot_is_speaking(True)
# Update bot responding state and send service start frames
# (AUDIO modality case)
if not self._bot_is_responding:
await self._set_bot_is_responding(True)
await self.push_frame(TTSStartedFrame())
await self.push_frame(LLMFullResponseStartFrame())
@@ -1354,7 +1475,6 @@ class GeminiLiveLLMService(LLMService):
@traced_gemini_live(operation="llm_response")
async def _handle_msg_turn_complete(self, message: LiveServerMessage):
"""Handle the turn complete message."""
await self._set_bot_is_speaking(False)
text = self._bot_text_buffer
# Trace the complete LLM response (this will be handled by the decorator)
@@ -1373,13 +1493,15 @@ class GeminiLiveLLMService(LLMService):
self._search_result_buffer = ""
self._accumulated_grounding_metadata = None
# Only push the TTSStoppedFrame if the bot is outputting audio
# when text is found, modalities is set to TEXT and no audio
# is produced.
if not text:
await self.push_frame(TTSStoppedFrame())
await self.push_frame(LLMFullResponseEndFrame())
if self._bot_is_responding:
await self._set_bot_is_responding(False)
if not text:
# AUDIO modality case
await self.push_frame(TTSStoppedFrame())
await self.push_frame(LLMFullResponseEndFrame())
else:
# TEXT modality case
await self.push_frame(LLMFullResponseEndFrame())
@traced_stt
async def _handle_user_transcription(
@@ -1442,8 +1564,8 @@ class GeminiLiveLLMService(LLMService):
return
# This is the output transcription text when modalities is set to AUDIO.
# In this case, we push LLMTextFrame and TTSTextFrame to be handled by the
# downstream assistant context aggregator.
# In this case, we push TTSTextFrame to be handled by the downstream
# assistant context aggregator.
text = message.server_content.output_transcription.text
if not text:
@@ -1458,7 +1580,17 @@ class GeminiLiveLLMService(LLMService):
# Collect text for tracing
self._llm_output_buffer += text
await self.push_frame(LLMTextFrame(text=text))
# NOTE: Shoot. When using Vertex AI, output transcription messages
# arrive *before* the model_turn messages with audio, so we need to
# handle sending TTSStartedFrame and LLMFullResponseStartFrame here as
# well. These messages also contain much *more* text (it looks further
# ahead). That means that on an interruption our recorded context will
# contain some text that was actually never spoken.
if not self._bot_is_responding:
await self._set_bot_is_responding(True)
await self.push_frame(TTSStartedFrame())
await self.push_frame(LLMFullResponseStartFrame())
await self.push_frame(TTSTextFrame(text=text))
async def _handle_msg_grounding_metadata(self, message: LiveServerMessage):
@@ -1557,26 +1689,26 @@ class GeminiLiveLLMService(LLMService):
*,
user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
) -> GeminiLiveContextAggregatorPair:
) -> LLMContextAggregatorPair:
"""Create an instance of GeminiLiveContextAggregatorPair from an OpenAILLMContext.
Constructor keyword arguments for both the user and assistant aggregators can be provided.
NOTE: this method exists only for backward compatibility. New code
should instead do:
context = LLMContext(...)
context_aggregator = LLMContextAggregatorPair(context)
Args:
context: The LLM context to use.
user_params: User aggregator parameters. Defaults to LLMUserAggregatorParams().
assistant_params: Assistant aggregator parameters. Defaults to LLMAssistantAggregatorParams().
Returns:
GeminiLiveContextAggregatorPair: A pair of context
aggregators, one for the user and one for the assistant,
encapsulated in an GeminiLiveContextAggregatorPair.
A pair of user and assistant context aggregators.
"""
context.set_llm_adapter(self.get_llm_adapter())
GeminiLiveContext.upgrade(context)
user = GeminiLiveUserContextAggregator(context, params=user_params)
context = LLMContext.from_openai_context(context)
assistant_params.expect_stripped_words = False
assistant = GeminiLiveAssistantContextAggregator(context, params=assistant_params)
return GeminiLiveContextAggregatorPair(_user=user, _assistant=assistant)
return LLMContextAggregatorPair(
context, user_params=user_params, assistant_params=assistant_params
)

View File

@@ -1034,6 +1034,23 @@ class GoogleLLMService(LLMService):
if context:
await self._process_context(context)
async def stop(self, frame):
"""Override stop to gracefully close the client."""
await super().stop(frame)
await self._close_client()
async def cancel(self, frame):
"""Override cancel to gracefully close the client."""
await super().cancel(frame)
await self._close_client()
async def _close_client(self):
try:
await self._client.aio.aclose()
except Exception:
# Do nothing - we're shutting down anyway
pass
def create_context_aggregator(
self,
context: OpenAILLMContext,

View File

@@ -730,6 +730,8 @@ class GoogleSTTService(STTService):
self._request_queue = asyncio.Queue()
self._streaming_task = self.create_task(self._stream_audio())
await self._call_event_handler("on_connected")
async def _disconnect(self):
"""Clean up streaming recognition resources."""
if self._streaming_task:
@@ -737,6 +739,8 @@ class GoogleSTTService(STTService):
await self.cancel_task(self._streaming_task)
self._streaming_task = None
await self._call_event_handler("on_disconnected")
async def _request_generator(self):
"""Generates requests for the streaming recognize method."""
recognizer_path = f"projects/{self._project_id}/locations/{self._location}/recognizers/_"

View File

@@ -184,11 +184,15 @@ class HumeTTSService(TTSService):
# Hume emits mono PCM at 48 kHz; downstream can resample if needed.
# We buffer audio bytes before sending to prevent glitches.
self._audio_bytes = b""
# Use version "2" by default if no description is provided
# Version "1" is needed when description is used
version = "1" if self._params.description is not None else "2"
async for chunk in self._client.tts.synthesize_json_streaming(
utterances=[utterance],
format=pcm_fmt,
instant_mode=True,
version="2",
version=version,
):
audio_b64 = getattr(chunk, "audio", None)
if not audio_b64:

View File

@@ -222,6 +222,7 @@ class LmntTTSService(InterruptibleTTSService):
# Send initialization message
await self._websocket.send(json.dumps(init_msg))
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -243,6 +244,7 @@ class LmntTTSService(InterruptibleTTSService):
finally:
self._started = False
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
"""Get the WebSocket connection if available."""

View File

@@ -293,6 +293,8 @@ class NeuphonicTTSService(InterruptibleTTSService):
headers = {"x-api-key": self._api_key}
self._websocket = await websocket_connect(url, additional_headers=headers)
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -311,6 +313,7 @@ class NeuphonicTTSService(InterruptibleTTSService):
finally:
self._started = False
self._websocket = None
await self._call_event_handler("on_disconnected")
async def _receive_messages(self):
"""Receive and process messages from Neuphonic WebSocket."""

View File

@@ -4,7 +4,85 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
"""OpenAI Realtime LLM context and aggregator implementations."""
"""OpenAI Realtime LLM context and aggregator implementations.
.. deprecated:: 0.0.92
OpenAI Realtime no longer uses types from this module under the hood.
It now uses `LLMContext` and `LLMContextAggregatorPair`.
Using the new patterns should allow you to not need types from this module.
BEFORE:
```
# Setup
context = OpenAILLMContext(messages, tools)
context_aggregator = llm.create_context_aggregator(context)
# Context aggregator type
context_aggregator: OpenAIContextAggregatorPair
# Context frame type
frame: OpenAILLMContextFrame
# Context type
context: OpenAIRealtimeLLMContext
# or
context: OpenAILLMContext
```
AFTER:
```
# Setup
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
# Context aggregator type
context_aggregator: LLMContextAggregatorPair
# Context frame type
frame: LLMContextFrame
# Context type
context: LLMContext
```
"""
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"Types in pipecat.services.openai.realtime.llm (or "
"pipecat.services.openai_realtime.llm) are deprecated. \n"
"OpenAI Realtime no longer uses types from this module under the hood. \n"
"It now uses `LLMContext` and `LLMContextAggregatorPair`. \n"
"Using the new patterns should allow you to not need types from this module.\n\n"
"BEFORE:\n"
"```\n"
"# Setup\n"
"context = OpenAILLMContext(messages, tools)\n"
"context_aggregator = llm.create_context_aggregator(context)\n\n"
"# Context aggregator type\n"
"context_aggregator: OpenAIContextAggregatorPair\n\n"
"# Context frame type\n"
"frame: OpenAILLMContextFrame\n\n"
"# Context type\n"
"context: OpenAIRealtimeLLMContext\n"
"# or\n"
"context: OpenAILLMContext\n\n"
"```\n\n"
"AFTER:\n"
"```\n"
"# Setup\n"
"context = LLMContext(messages, tools)\n"
"context_aggregator = LLMContextAggregatorPair(context)\n\n"
"# Context aggregator type\n"
"context_aggregator: LLMContextAggregatorPair\n\n"
"# Context frame type\n"
"frame: LLMContextFrame\n\n"
"# Context type\n"
"context: LLMContext\n\n"
"```\n",
)
import copy
import json

View File

@@ -4,7 +4,28 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
"""Custom frame types for OpenAI Realtime API integration."""
"""Custom frame types for OpenAI Realtime API integration.
.. deprecated:: 0.0.92
OpenAI Realtime no longer uses types from this module under the hood.
It now works more like most LLM services in Pipecat, relying on updates to
its context, pushed by context aggregators, to update its internal state.
Listen for `LLMContextFrame`s for context updates.
"""
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"Types in pipecat.services.openai.realtime.frames are deprecated. \n"
"OpenAI Realtime no longer uses types from this module under the hood. \n\n"
"It now works more like other LLM services in Pipecat, relying on updates to \n"
"its context, pushed by context aggregators, to update its internal state.\n\n"
"Listen for `LLMContextFrame`s for context updates.\n"
)
from dataclasses import dataclass
from typing import TYPE_CHECKING

View File

@@ -14,7 +14,9 @@ from typing import Optional
from loguru import logger
from pipecat.adapters.services.open_ai_realtime_adapter import OpenAIRealtimeLLMAdapter
from pipecat.adapters.services.open_ai_realtime_adapter import (
OpenAIRealtimeLLMAdapter,
)
from pipecat.frames.frames import (
BotStoppedSpeakingFrame,
CancelFrame,
@@ -41,10 +43,12 @@ from pipecat.frames.frames import (
UserStoppedSpeakingFrame,
)
from pipecat.metrics.metrics import LLMTokenUsage
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response import (
LLMAssistantAggregatorParams,
LLMUserAggregatorParams,
)
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
from pipecat.processors.aggregators.openai_llm_context import (
OpenAILLMContext,
OpenAILLMContextFrame,
@@ -57,12 +61,6 @@ from pipecat.utils.time import time_now_iso8601
from pipecat.utils.tracing.service_decorators import traced_openai_realtime, traced_stt
from . import events
from .context import (
OpenAIRealtimeAssistantContextAggregator,
OpenAIRealtimeLLMContext,
OpenAIRealtimeUserContextAggregator,
)
from .frames import RealtimeFunctionCallResultFrame, RealtimeMessagesUpdateFrame
try:
from websockets.asyncio.client import connect as websocket_connect
@@ -108,22 +106,39 @@ class OpenAIRealtimeLLMService(LLMService):
base_url: str = "wss://api.openai.com/v1/realtime",
session_properties: Optional[events.SessionProperties] = None,
start_audio_paused: bool = False,
send_transcription_frames: bool = True,
send_transcription_frames: Optional[bool] = None,
**kwargs,
):
"""Initialize the OpenAI Realtime LLM service.
Args:
api_key: OpenAI API key for authentication.
model: OpenAI model name. Defaults to "gpt-4o-realtime-preview-2025-06-03".
model: OpenAI model name. Defaults to "gpt-realtime".
base_url: WebSocket base URL for the realtime API.
Defaults to "wss://api.openai.com/v1/realtime".
session_properties: Configuration properties for the realtime session.
If None, uses default SessionProperties.
start_audio_paused: Whether to start with audio input paused. Defaults to False.
send_transcription_frames: Whether to emit transcription frames. Defaults to True.
send_transcription_frames: Whether to emit transcription frames.
.. deprecated:: 0.0.92
This parameter is deprecated and will be removed in a future version.
Transcription frames are always sent.
**kwargs: Additional arguments passed to parent LLMService.
"""
if send_transcription_frames is not None:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"`send_transcription_frames` is deprecated and will be removed in a future version. "
"Transcription frames are always sent.",
DeprecationWarning,
stacklevel=2,
)
full_url = f"{base_url}?model={model}"
super().__init__(base_url=full_url, **kwargs)
@@ -135,10 +150,11 @@ class OpenAIRealtimeLLMService(LLMService):
session_properties or events.SessionProperties()
)
self._audio_input_paused = start_audio_paused
self._send_transcription_frames = send_transcription_frames
self._websocket = None
self._receive_task = None
self._context = None
self._context: LLMContext = None
self._llm_needs_conversation_setup = True
self._disconnecting = False
self._api_session_ready = False
@@ -148,8 +164,8 @@ class OpenAIRealtimeLLMService(LLMService):
self._current_audio_response = None
self._messages_added_manually = {}
self._user_and_response_message_tuple = None
self._pending_function_calls = {} # Track function calls by call_id
self._completed_tool_calls = set()
self._register_event_handler("on_conversation_item_created")
self._register_event_handler("on_conversation_item_updated")
@@ -347,22 +363,13 @@ class OpenAIRealtimeLLMService(LLMService):
if isinstance(frame, TranscriptionFrame):
pass
elif isinstance(frame, OpenAILLMContextFrame):
context: OpenAIRealtimeLLMContext = OpenAIRealtimeLLMContext.upgrade_to_realtime(
elif isinstance(frame, (LLMContextFrame, OpenAILLMContextFrame)):
context = (
frame.context
if isinstance(frame, LLMContextFrame)
else LLMContext.from_openai_context(frame.context)
)
if not self._context:
self._context = context
elif frame.context is not self._context:
# If the context has changed, reset the conversation
self._context = context
await self.reset_conversation()
# Run the LLM at next opportunity
await self._create_response()
elif isinstance(frame, LLMContextFrame):
raise NotImplementedError(
"Universal LLMContext is not yet supported for OpenAI Realtime."
)
await self._handle_context(context)
elif isinstance(frame, InputAudioRawFrame):
if not self._audio_input_paused:
await self._send_user_audio(frame)
@@ -376,29 +383,33 @@ class OpenAIRealtimeLLMService(LLMService):
await self._handle_bot_stopped_speaking()
elif isinstance(frame, LLMMessagesAppendFrame):
await self._handle_messages_append(frame)
elif isinstance(frame, RealtimeMessagesUpdateFrame):
self._context = frame.context
elif isinstance(frame, LLMUpdateSettingsFrame):
self._session_properties = events.SessionProperties(**frame.settings)
await self._update_settings()
elif isinstance(frame, LLMSetToolsFrame):
await self._update_settings()
elif isinstance(frame, RealtimeFunctionCallResultFrame):
await self._handle_function_call_result(frame.result_frame)
await self.push_frame(frame, direction)
async def _handle_context(self, context: LLMContext):
if not self._context:
# We got our initial context
self._context = context
# Initialize our bookkeeping of already-completed tool calls in
# the context
await self._process_completed_function_calls(send_new_results=False)
# Run the LLM at next opportunity
await self._create_response()
else:
# We got an updated context.
# This may contain a new user message or tool call result.
self._context = context
# Send results for newly-completed function calls, if any.
await self._process_completed_function_calls(send_new_results=True)
async def _handle_messages_append(self, frame):
logger.error("!!! NEED TO IMPLEMENT MESSAGES APPEND")
async def _handle_function_call_result(self, frame):
item = events.ConversationItem(
type="function_call_output",
call_id=frame.tool_call_id,
output=json.dumps(frame.result),
)
await self.send_client_event(events.ConversationItemCreateEvent(item=item))
#
# websocket communication
#
@@ -439,16 +450,21 @@ class OpenAIRealtimeLLMService(LLMService):
if self._receive_task:
await self.cancel_task(self._receive_task, timeout=1.0)
self._receive_task = None
self._completed_tool_calls = set()
self._disconnecting = False
except Exception as e:
logger.error(f"{self} error disconnecting: {e}")
async def _ws_send(self, realtime_message):
try:
if self._websocket:
if not self._disconnecting and self._websocket:
await self._websocket.send(json.dumps(realtime_message))
except Exception as e:
if self._disconnecting:
if self._disconnecting or not self._websocket:
# We're in the process of disconnecting.
# (If not self._websocket, that could indicate that we
# somehow *started* the websocket send attempt while we still
# had a connection)
return
logger.error(f"Error sending message to websocket: {e}")
# In server-to-server contexts, a WebSocket error should be quite rare. Given how hard
@@ -459,13 +475,20 @@ class OpenAIRealtimeLLMService(LLMService):
async def _update_settings(self):
settings = self._session_properties
# tools given in the context override the tools in the session properties
if self._context and self._context.tools:
settings.tools = self._context.tools
# instructions in the context come from an initial "system" message in the
# messages list, and override instructions in the session properties
if self._context and self._context._session_instructions:
settings.instructions = self._context._session_instructions
if self._context:
adapter: OpenAIRealtimeLLMAdapter = self.get_llm_adapter()
llm_invocation_params = adapter.get_llm_invocation_params(self._context)
# tools given in the context override the tools in the session properties
if llm_invocation_params["tools"]:
settings.tools = llm_invocation_params["tools"]
# instructions in the context come from an initial "system" message in the
# messages list, and override instructions in the session properties
if llm_invocation_params["system_instruction"]:
settings.instructions = llm_invocation_params["system_instruction"]
await self.send_client_event(events.SessionUpdateEvent(session=settings))
#
@@ -571,12 +594,7 @@ class OpenAIRealtimeLLMService(LLMService):
del self._messages_added_manually[evt.item.id]
return
if evt.item.role == "user":
# We need to wait for completion of both user message and response message. Then we'll
# add both to the context. User message is complete when we have a "transcript" field
# that is not None. Response message is complete when we get a "response.done" event.
self._user_and_response_message_tuple = (evt.item, {"done": False, "output": []})
elif evt.item.role == "assistant":
if evt.item.role == "assistant":
self._current_assistant_response = evt.item
await self.push_frame(LLMFullResponseStartFrame())
@@ -587,11 +605,11 @@ class OpenAIRealtimeLLMService(LLMService):
# For now, no additional logic needed beyond the event handler call
async def _handle_evt_input_audio_transcription_delta(self, evt):
if self._send_transcription_frames:
await self.push_frame(
# no way to get a language code?
InterimTranscriptionFrame(evt.delta, "", time_now_iso8601(), result=evt)
)
await self.push_frame(
# no way to get a language code?
InterimTranscriptionFrame(evt.delta, "", time_now_iso8601(), result=evt),
direction=FrameDirection.UPSTREAM,
)
@traced_stt
async def _handle_user_transcription(
@@ -608,22 +626,12 @@ class OpenAIRealtimeLLMService(LLMService):
"""
await self._call_event_handler("on_conversation_item_updated", evt.item_id, None)
if self._send_transcription_frames:
await self.push_frame(
# no way to get a language code?
TranscriptionFrame(evt.transcript, "", time_now_iso8601(), result=evt)
)
await self._handle_user_transcription(evt.transcript, True, Language.EN)
pair = self._user_and_response_message_tuple
if pair:
user, assistant = pair
user.content[0].transcript = evt.transcript
if assistant["done"]:
self._user_and_response_message_tuple = None
self._context.add_user_content_item_as_message(user)
else:
# User message without preceding conversation.item.created. Bug?
logger.warning(f"Transcript for unknown user message: {evt}")
await self.push_frame(
# no way to get a language code?
TranscriptionFrame(evt.transcript, "", time_now_iso8601(), result=evt),
FrameDirection.UPSTREAM,
)
await self._handle_user_transcription(evt.transcript, True, Language.EN)
async def _handle_conversation_item_retrieved(self, evt: events.ConversationItemRetrieved):
futures = self._retrieve_conversation_item_futures.pop(evt.item.id, None)
@@ -653,26 +661,17 @@ class OpenAIRealtimeLLMService(LLMService):
# response content
for item in evt.response.output:
await self._call_event_handler("on_conversation_item_updated", item.id, item)
pair = self._user_and_response_message_tuple
if pair:
user, assistant = pair
assistant["done"] = True
assistant["output"] = evt.response.output
if user.content[0].transcript is not None:
self._user_and_response_message_tuple = None
self._context.add_user_content_item_as_message(user)
else:
# Response message without preceding user message (standalone response)
# Function calls in this response were already processed immediately when arguments were complete
logger.debug(f"Handling standalone response: {evt.response.id}")
async def _handle_evt_text_delta(self, evt):
# We receive text deltas (as opposed to audio transcript deltas) when
# the output modality is "text"
if evt.delta:
await self.push_frame(LLMTextFrame(evt.delta))
async def _handle_evt_audio_transcript_delta(self, evt):
# We receive audio transcript deltas (as opposed to text deltas) when
# the output modality is "audio" (the default)
if evt.delta:
await self.push_frame(LLMTextFrame(evt.delta))
await self.push_frame(TTSTextFrame(evt.delta))
async def _handle_evt_function_call_arguments_done(self, evt):
@@ -760,9 +759,11 @@ class OpenAIRealtimeLLMService(LLMService):
"""
logger.debug("Resetting conversation")
await self._disconnect()
if self._context:
self._context.llm_needs_settings_update = True
self._context.llm_needs_initial_messages = True
# Prepare to setup server-side conversation from local context again
self._llm_needs_conversation_setup = True
await self._process_completed_function_calls(send_new_results=False)
await self._connect()
@traced_openai_realtime(operation="llm_request")
@@ -771,19 +772,29 @@ class OpenAIRealtimeLLMService(LLMService):
self._run_llm_when_api_session_ready = True
return
if self._context.llm_needs_initial_messages:
messages = self._context.get_messages_for_initializing_history()
adapter: OpenAIRealtimeLLMAdapter = self.get_llm_adapter()
# Configure the LLM for this session if needed
if self._llm_needs_conversation_setup:
logger.debug(
f"Setting up conversation on OpenAI Realtime LLM service with initial messages: {adapter.get_messages_for_logging(self._context)}"
)
# Send initial messages
llm_invocation_params = adapter.get_llm_invocation_params(self._context)
messages = llm_invocation_params["messages"]
for item in messages:
evt = events.ConversationItemCreateEvent(item=item)
self._messages_added_manually[evt.item.id] = True
await self.send_client_event(evt)
self._context.llm_needs_initial_messages = False
if self._context.llm_needs_settings_update:
# Send new settings if needed
await self._update_settings()
self._context.llm_needs_settings_update = False
logger.debug(f"Creating response: {self._context.get_messages_for_logging()}")
# We're done configuring the LLM for this session
self._llm_needs_conversation_setup = False
logger.debug(f"Creating response")
await self.push_frame(LLMFullResponseStartFrame())
await self.start_processing_metrics()
@@ -794,19 +805,50 @@ class OpenAIRealtimeLLMService(LLMService):
)
)
async def _process_completed_function_calls(self, send_new_results: bool):
# Check for set of completed function calls in the context
sent_new_result = False
for message in self._context.get_messages():
if message.get("role") and message.get("content") != "IN_PROGRESS":
tool_call_id = message.get("tool_call_id")
if tool_call_id and tool_call_id not in self._completed_tool_calls:
# Found a newly-completed function call - send the result to the service
if send_new_results:
sent_new_result = True
await self._send_tool_result(tool_call_id, message.get("content"))
self._completed_tool_calls.add(tool_call_id)
# If we reported any new tool call results to the service, trigger
# another response
if sent_new_result:
await self._create_response()
async def _send_user_audio(self, frame):
payload = base64.b64encode(frame.audio).decode("utf-8")
await self.send_client_event(events.InputAudioBufferAppendEvent(audio=payload))
async def _send_tool_result(self, tool_call_id: str, result: str):
item = events.ConversationItem(
type="function_call_output",
call_id=tool_call_id,
output=json.dumps(result),
)
await self.send_client_event(events.ConversationItemCreateEvent(item=item))
def create_context_aggregator(
self,
context: OpenAILLMContext,
*,
user_params: LLMUserAggregatorParams = LLMUserAggregatorParams(),
assistant_params: LLMAssistantAggregatorParams = LLMAssistantAggregatorParams(),
) -> OpenAIContextAggregatorPair:
) -> LLMContextAggregatorPair:
"""Create an instance of OpenAIContextAggregatorPair from an OpenAILLMContext.
NOTE: this method exists only for backward compatibility. New code
should instead do:
context = LLMContext(...)
context_aggregator = LLMContextAggregatorPair(context)
Constructor keyword arguments for both the user and assistant aggregators can be provided.
Args:
@@ -819,11 +861,41 @@ class OpenAIRealtimeLLMService(LLMService):
the user and one for the assistant, encapsulated in an
OpenAIContextAggregatorPair.
"""
context.set_llm_adapter(self.get_llm_adapter())
OpenAIRealtimeLLMContext.upgrade_to_realtime(context)
user = OpenAIRealtimeUserContextAggregator(context, params=user_params)
# Log warning about transcription frame direction change in 0.0.92.
# We're putting this warning here rather than in the constructor so
# that it shows up for folks who haven't updated their code at all
# since 0.0.92, gives them a way to acknowledge and dismiss the
# warning, and encourages adoption of a new preferred pattern.
logger.warning(
"As of version 0.0.92, TranscriptionFrames and InterimTranscriptionFrames "
"now go upstream from OpenAIRealtimeLLMService, so if you're using "
"TranscriptProcessor, say, you'll want to adjust accordingly:\n\n"
"pipeline = Pipeline(\n"
" [\n"
" transport.input(),\n"
" context_aggregator.user(),\n\n"
" # BEFORE\n"
" llm,\n"
" transcript.user(),\n\n"
" # AFTER\n"
" transcript.user(),\n"
" llm,\n\n"
" transport.output(),\n"
" transcript.assistant(),\n"
" context_aggregator.assistant(),\n"
" ]\n"
")\n\n"
"Also, LLMTextFrames are no longer pushed from "
"OpenAIRealtimeLLMService when it's configured with "
"output_modalities=['audio']. Listen for TTSTextFrames instead.\n\n"
"Once you've made the appropriate changes (if needed), you can "
"dismiss this warning by updating to the new context-setup pattern:\n\n"
" context = LLMContext(messages, tools)\n"
" context_aggregator = LLMContextAggregatorPair(context)\n"
)
context = LLMContext.from_openai_context(context)
assistant_params.expect_stripped_words = False
assistant = OpenAIRealtimeAssistantContextAggregator(context, params=assistant_params)
return OpenAIContextAggregatorPair(_user=user, _assistant=assistant)
return LLMContextAggregatorPair(
context, user_params=user_params, assistant_params=assistant_params
)

View File

@@ -14,6 +14,7 @@ from typing import AsyncGenerator, Dict, Literal, Optional
from loguru import logger
from openai import AsyncOpenAI, BadRequestError
from pydantic import BaseModel
from pipecat.frames.frames import (
ErrorFrame,
@@ -55,6 +56,17 @@ class OpenAITTSService(TTSService):
OPENAI_SAMPLE_RATE = 24000 # OpenAI TTS always outputs at 24kHz
class InputParams(BaseModel):
"""Input parameters for OpenAI TTS configuration.
Parameters:
instructions: Instructions to guide voice synthesis behavior.
speed: Voice speed control (0.25 to 4.0, default 1.0).
"""
instructions: Optional[str] = None
speed: Optional[float] = None
def __init__(
self,
*,
@@ -65,6 +77,7 @@ class OpenAITTSService(TTSService):
sample_rate: Optional[int] = None,
instructions: Optional[str] = None,
speed: Optional[float] = None,
params: Optional[InputParams] = None,
**kwargs,
):
"""Initialize OpenAI TTS service.
@@ -77,7 +90,11 @@ class OpenAITTSService(TTSService):
sample_rate: Output audio sample rate in Hz. If None, uses OpenAI's default 24kHz.
instructions: Optional instructions to guide voice synthesis behavior.
speed: Voice speed control (0.25 to 4.0, default 1.0).
params: Optional synthesis controls (acting instructions, speed, ...).
**kwargs: Additional keyword arguments passed to TTSService.
.. deprecated:: 0.0.91
The `instructions` and `speed` parameters are deprecated, use `InputParams` instead.
"""
if sample_rate and sample_rate != self.OPENAI_SAMPLE_RATE:
logger.warning(
@@ -86,12 +103,26 @@ class OpenAITTSService(TTSService):
)
super().__init__(sample_rate=sample_rate, **kwargs)
self._speed = speed
self.set_model_name(model)
self.set_voice(voice)
self._instructions = instructions
self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)
if instructions or speed:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"The `instructions` and `speed` parameters are deprecated, use `InputParams` instead.",
DeprecationWarning,
stacklevel=2,
)
self._settings = {
"instructions": params.instructions if params else instructions,
"speed": params.speed if params else speed,
}
def can_generate_metrics(self) -> bool:
"""Check if this service can generate processing metrics.
@@ -144,11 +175,11 @@ class OpenAITTSService(TTSService):
"response_format": "pcm",
}
if self._instructions:
create_params["instructions"] = self._instructions
if self._settings["instructions"]:
create_params["instructions"] = self._settings["instructions"]
if self._speed:
create_params["speed"] = self._speed
if self._settings["speed"]:
create_params["speed"] = self._settings["speed"]
async with self._client.audio.speech.with_streaming_response.create(
**create_params

View File

@@ -4,18 +4,15 @@
# SPDX-License-Identifier: BSD 2-Clause License
#
"""OpenAI Realtime LLM context and aggregator implementations."""
"""OpenAI Realtime LLM context and aggregator implementations.
import warnings
.. deprecated:: 0.0.91
OpenAI Realtime no longer uses types from this module under the hood.
It now uses `LLMContext` and `LLMContextAggregatorPair`.
Using the new patterns should allow you to not need types from this module.
See deprecation warning in pipecat.services.openai.realtime.context for
more details.
"""
from pipecat.services.openai.realtime.context import *
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"Types in pipecat.services.openai_realtime.context are deprecated. "
"Please use the equivalent types from "
"pipecat.services.openai.realtime.context instead.",
DeprecationWarning,
stacklevel=2,
)

View File

@@ -70,7 +70,7 @@ class AzureRealtimeBetaLLMService(OpenAIRealtimeBetaLLMService):
# handle disconnections in the send/recv code paths.
return
logger.info(f"Connecting to {self.base_url}, api key: {self.api_key}")
logger.info(f"Connecting to {self.base_url}")
self._websocket = await websocket_connect(
uri=self.base_url,
additional_headers={

View File

@@ -269,6 +269,8 @@ class PlayHTTTSService(InterruptibleTTSService):
raise ValueError("WebSocket URL is not a string")
self._websocket = await websocket_connect(self._websocket_url)
await self._call_event_handler("on_connected")
except ValueError as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -291,6 +293,7 @@ class PlayHTTTSService(InterruptibleTTSService):
finally:
self._request_id = None
self._websocket = None
await self._call_event_handler("on_disconnected")
async def _get_websocket_url(self):
"""Retrieve WebSocket URL from PlayHT API."""

View File

@@ -255,6 +255,8 @@ class RimeTTSService(AudioContextWordTTSService):
url = f"{self._url}?{params}"
headers = {"Authorization": f"Bearer {self._api_key}"}
self._websocket = await websocket_connect(url, additional_headers=headers)
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -272,6 +274,7 @@ class RimeTTSService(AudioContextWordTTSService):
finally:
self._context_id = None
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
"""Get active websocket connection or raise exception."""

View File

@@ -583,7 +583,9 @@ class RivaSegmentedSTTService(SegmentedSTTService):
self._config.language_code = self._language
@traced_stt
async def _handle_transcription(self, transcript: str, language: Optional[Language] = None):
async def _handle_transcription(
self, transcript: str, is_final: bool, language: Optional[Language] = None
):
"""Handle a transcription result with tracing."""
pass

View File

@@ -76,17 +76,29 @@ class SarvamHttpTTSService(TTSService):
Example::
tts = SarvamTTSService(
tts = SarvamHttpTTSService(
api_key="your-api-key",
voice_id="anushka",
model="bulbul:v2",
aiohttp_session=session,
params=SarvamTTSService.InputParams(
params=SarvamHttpTTSService.InputParams(
language=Language.HI,
pitch=0.1,
pace=1.2
)
)
# For bulbul v3 beta with any speaker:
tts_v3 = SarvamHttpTTSService(
api_key="your-api-key",
voice_id="speaker_name",
model="bulbul:v3,
aiohttp_session=session,
params=SarvamHttpTTSService.InputParams(
language=Language.HI,
temperature=0.8
)
)
"""
class InputParams(BaseModel):
@@ -105,6 +117,14 @@ class SarvamHttpTTSService(TTSService):
pace: Optional[float] = Field(default=1.0, ge=0.3, le=3.0)
loudness: Optional[float] = Field(default=1.0, ge=0.1, le=3.0)
enable_preprocessing: Optional[bool] = False
temperature: Optional[float] = Field(
default=0.6,
ge=0.01,
le=1.0,
description="Controls the randomness of the output for bulbul v3 beta. "
"Lower values make the output more focused and deterministic, while "
"higher values make it more random. Range: 0.01 to 1.0. Default: 0.6.",
)
def __init__(
self,
@@ -124,7 +144,7 @@ class SarvamHttpTTSService(TTSService):
api_key: Sarvam AI API subscription key.
aiohttp_session: Shared aiohttp session for making requests.
voice_id: Speaker voice ID (e.g., "anushka", "meera"). Defaults to "anushka".
model: TTS model to use ("bulbul:v1" or "bulbul:v2"). Defaults to "bulbul:v2".
model: TTS model to use ("bulbul:v2" or "bulbul:v3-beta" or "bulbul:v3"). Defaults to "bulbul:v2".
base_url: Sarvam AI API base URL. Defaults to "https://api.sarvam.ai".
sample_rate: Audio sample rate in Hz (8000, 16000, 22050, 24000). If None, uses default.
params: Additional voice and preprocessing parameters. If None, uses defaults.
@@ -138,16 +158,32 @@ class SarvamHttpTTSService(TTSService):
self._base_url = base_url
self._session = aiohttp_session
# Build base settings common to all models
self._settings = {
"language": (
self.language_to_service_language(params.language) if params.language else "en-IN"
),
"pitch": params.pitch,
"pace": params.pace,
"loudness": params.loudness,
"enable_preprocessing": params.enable_preprocessing,
}
# Add model-specific parameters
if model in ("bulbul:v3-beta", "bulbul:v3"):
self._settings.update(
{
"temperature": getattr(params, "temperature", 0.6),
"model": model,
}
)
else:
self._settings.update(
{
"pitch": params.pitch,
"pace": params.pace,
"loudness": params.loudness,
"model": model,
}
)
self.set_model_name(model)
self.set_voice(voice_id)
@@ -275,6 +311,18 @@ class SarvamTTSService(InterruptibleTTSService):
pace=1.2
)
)
# For bulbul v3 beta with any speaker and temperature:
# Note: pace and loudness are not supported for bulbul v3 and bulbul v3 beta
tts_v3 = SarvamTTSService(
api_key="your-api-key",
voice_id="speaker_name",
model="bulbul:v3",
params=SarvamTTSService.InputParams(
language=Language.HI,
temperature=0.8
)
)
"""
class InputParams(BaseModel):
@@ -310,6 +358,14 @@ class SarvamTTSService(InterruptibleTTSService):
output_audio_codec: Optional[str] = "linear16"
output_audio_bitrate: Optional[str] = "128k"
language: Optional[Language] = Language.EN
temperature: Optional[float] = Field(
default=0.6,
ge=0.01,
le=1.0,
description="Controls the randomness of the output for bulbul v3 beta. "
"Lower values make the output more focused and deterministic, while "
"higher values make it more random. Range: 0.01 to 1.0. Default: 0.6.",
)
def __init__(
self,
@@ -318,7 +374,6 @@ class SarvamTTSService(InterruptibleTTSService):
model: str = "bulbul:v2",
voice_id: str = "anushka",
url: str = "wss://api.sarvam.ai/text-to-speech/ws",
aiohttp_session: Optional[aiohttp.ClientSession] = None,
aggregate_sentences: Optional[bool] = True,
sample_rate: Optional[int] = None,
params: Optional[InputParams] = None,
@@ -329,13 +384,9 @@ class SarvamTTSService(InterruptibleTTSService):
Args:
api_key: Sarvam API key for authenticating TTS requests.
model: Identifier of the Sarvam speech model (default "bulbul:v2").
Supports "bulbul:v2", "bulbul:v3-beta" and "bulbul:v3".
voice_id: Voice identifier for synthesis (default "anushka").
url: WebSocket URL for connecting to the TTS backend (default production URL).
aiohttp_session: Optional shared aiohttp session. To maintain backward compatibility.
.. deprecated:: 0.0.81
aiohttp_session is no longer used. This parameter will be removed in a future version.
aggregate_sentences: Whether to merge multiple sentences into one audio chunk (default True).
sample_rate: Desired sample rate for the output audio in Hz (overrides default if set).
params: Optional input parameters to override global configuration.
@@ -356,30 +407,18 @@ class SarvamTTSService(InterruptibleTTSService):
**kwargs,
)
params = params or SarvamTTSService.InputParams()
if aiohttp_session is not None:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("always")
warnings.warn(
"The 'aiohttp_session' parameter is deprecated and will be removed in a future version. ",
DeprecationWarning,
stacklevel=2,
)
# WebSocket endpoint URL
self._websocket_url = f"{url}?model={model}"
self._api_key = api_key
self.set_model_name(model)
self.set_voice(voice_id)
# Configuration parameters
# Build base settings common to all models
self._settings = {
"target_language_code": (
self.language_to_service_language(params.language) if params.language else "en-IN"
),
"pitch": params.pitch,
"pace": params.pace,
"speaker": voice_id,
"loudness": params.loudness,
"speech_sample_rate": 0,
"enable_preprocessing": params.enable_preprocessing,
"min_buffer_size": params.min_buffer_size,
@@ -387,6 +426,24 @@ class SarvamTTSService(InterruptibleTTSService):
"output_audio_codec": params.output_audio_codec,
"output_audio_bitrate": params.output_audio_bitrate,
}
# Add model-specific parameters
if model in ("bulbul:v3-beta", "bulbul:v3"):
self._settings.update(
{
"temperature": getattr(params, "temperature", 0.6),
"model": model,
}
)
else:
self._settings.update(
{
"pitch": params.pitch,
"pace": params.pace,
"loudness": params.loudness,
"model": model,
}
)
self._started = False
self._receive_task = None
@@ -525,6 +582,7 @@ class SarvamTTSService(InterruptibleTTSService):
logger.debug("Connected to Sarvam TTS Websocket")
await self._send_config()
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} initialization error: {e}")
self._websocket = None
@@ -556,6 +614,10 @@ class SarvamTTSService(InterruptibleTTSService):
await self._websocket.close()
except Exception as e:
logger.error(f"{self} error closing websocket: {e}")
finally:
self._started = False
self._websocket = None
await self._call_event_handler("on_disconnected")
def _get_websocket(self):
if self._websocket:

View File

@@ -7,9 +7,12 @@
"""Simli video service for real-time avatar generation."""
import asyncio
import warnings
from typing import Optional
import numpy as np
from loguru import logger
from pydantic import BaseModel
from pipecat.frames.frames import (
CancelFrame,
@@ -41,30 +44,103 @@ class SimliVideoService(FrameProcessor):
audio resampling, video frame processing, and connection management.
"""
class InputParams(BaseModel):
"""Input parameters for Simli video configuration.
Parameters:
max_session_length: Absolute maximum session duration in seconds.
Avatar will disconnect after this time even if it's speaking.
max_idle_time: Maximum duration in seconds the avatar is not speaking
before the avatar disconnects.
"""
max_session_length: Optional[int] = None
max_idle_time: Optional[int] = None
def __init__(
self,
simli_config: SimliConfig,
*,
api_key: Optional[str] = None,
face_id: Optional[str] = None,
simli_config: Optional[SimliConfig] = None,
use_turn_server: bool = False,
latency_interval: int = 0,
simli_url: str = "https://api.simli.ai",
is_trinity_avatar: bool = False,
params: Optional[InputParams] = None,
**kwargs,
):
"""Initialize the Simli video service.
Args:
api_key: Simli API key for authentication.
face_id: Simli Face ID. For Trinity avatars, specify "faceId/emotionId"
to use a different emotion than the default.
simli_config: Configuration object for Simli client settings.
use_turn_server: Whether to use TURN server for connection. Defaults to False.
latency_interval: Latency interval setting for sending health checks to check the latency to Simli Servers. Defaults to 0.
simli_url: URL of the simli servers. Can be changed for custom deployments of enterprise users.
is_trinity_avatar: boolean to tell simli client that this is a Trinity avatar which reduces latency when using Trinity.
Use api_key and face_id instead.
.. deprecated:: 0.0.92
The 'simli_config' parameter is deprecated and will be removed in a future version.
Please use 'api_key' and 'face_id' parameters instead.
use_turn_server: Whether to use TURN server for connection. Defaults to False.
latency_interval: Latency interval setting for sending health checks to check
the latency to Simli Servers. Defaults to 0.
simli_url: URL of the simli servers. Can be changed for custom deployments
of enterprise users.
is_trinity_avatar: Boolean to tell simli client that this is a Trinity avatar
which reduces latency when using Trinity.
params: Additional input parameters for session configuration.
**kwargs: Additional arguments passed to the parent FrameProcessor.
"""
super().__init__()
super().__init__(**kwargs)
params = params or SimliVideoService.InputParams()
# Handle deprecated simli_config parameter
if simli_config is not None:
if api_key is not None or face_id is not None:
raise ValueError(
"Cannot specify both simli_config and api_key/face_id. "
"Please use api_key and face_id (simli_config is deprecated)."
)
warnings.warn(
"The 'simli_config' parameter is deprecated and will be removed in a future version. "
"Please use 'api_key' and 'face_id' parameters instead, with optional 'params' for "
"max_session_length and max_idle_time configuration.",
DeprecationWarning,
stacklevel=2,
)
# Use the provided simli_config
config = simli_config
else:
# Validate new parameters
if api_key is None:
raise ValueError("api_key is required")
if face_id is None:
raise ValueError("face_id is required")
# Build SimliConfig from new parameters
# Only pass optional parameters if explicitly provided to use SimliConfig defaults
config_kwargs = {
"apiKey": api_key,
"faceId": face_id,
}
if params.max_session_length is not None:
config_kwargs["maxSessionLength"] = params.max_session_length
if params.max_idle_time is not None:
config_kwargs["maxIdleTime"] = params.max_idle_time
config = SimliConfig(**config_kwargs)
self._initialized = False
simli_config.maxIdleTime += 5
simli_config.maxSessionLength += 5
# Add buffer time to session limits
config.maxIdleTime += 5
config.maxSessionLength += 5
self._simli_client = SimliClient(
simli_config,
config,
use_turn_server,
latency_interval,
simliURL=simli_url,

View File

@@ -577,6 +577,7 @@ class SpeechmaticsSTTService(STTService):
),
)
logger.debug(f"{self} Connected to Speechmatics STT service")
await self._call_event_handler("on_connected")
except Exception as e:
logger.error(f"{self} Error connecting to Speechmatics: {e}")
self._client = None
@@ -595,6 +596,7 @@ class SpeechmaticsSTTService(STTService):
logger.error(f"{self} Error closing Speechmatics client: {e}")
finally:
self._client = None
await self._call_event_handler("on_disconnected")
def _process_config(self) -> None:
"""Create a formatted STT transcription config.
@@ -618,7 +620,7 @@ class SpeechmaticsSTTService(STTService):
transcription_config.additional_vocab = [
{
"content": e.content,
"sounds_like": e.sounds_like,
**({"sounds_like": e.sounds_like} if e.sounds_like else {}),
}
for e in self._params.additional_vocab
]

View File

@@ -35,6 +35,25 @@ class STTService(AIService):
Provides common functionality for STT services including audio passthrough,
muting, settings management, and audio processing. Subclasses must implement
the run_stt method to provide actual speech recognition.
Event handlers:
on_connected: Called when connected to the STT service.
on_connected: Called when disconnected from the STT service.
on_connection_error: Called when a connection to the STT service error occurs.
Example::
@stt.event_handler("on_connected")
async def on_connected(stt: STTService):
logger.debug(f"STT connected")
@stt.event_handler("on_disconnected")
async def on_disconnected(stt: STTService):
logger.debug(f"STT disconnected")
@stt.event_handler("on_connection_error")
async def on_connection_error(stt: STTService, error: str):
logger.error(f"STT connection error: {error}")
"""
def __init__(
@@ -62,6 +81,10 @@ class STTService(AIService):
self._muted: bool = False
self._user_id: str = ""
self._register_event_handler("on_connected")
self._register_event_handler("on_disconnected")
self._register_event_handler("on_connection_error")
@property
def is_muted(self) -> bool:
"""Check if the STT service is currently muted.
@@ -292,15 +315,6 @@ class WebsocketSTTService(STTService, WebsocketService):
Combines STT functionality with websocket connectivity, providing automatic
error handling and reconnection capabilities.
Event handlers:
on_connection_error: Called when a websocket connection error occurs.
Example::
@stt.event_handler("on_connection_error")
async def on_connection_error(stt: STTService, error: str):
logger.error(f"STT connection error: {error}")
"""
def __init__(self, *, reconnect_on_error: bool = True, **kwargs):
@@ -312,7 +326,6 @@ class WebsocketSTTService(STTService, WebsocketService):
"""
STTService.__init__(self, **kwargs)
WebsocketService.__init__(self, reconnect_on_error=reconnect_on_error, **kwargs)
self._register_event_handler("on_connection_error")
async def _report_error(self, error: ErrorFrame):
await self._call_event_handler("on_connection_error", error.error)

View File

@@ -59,6 +59,25 @@ class TTSService(AIService):
Provides common functionality for TTS services including text aggregation,
filtering, audio generation, and frame management. Supports configurable
sentence aggregation, silence insertion, and frame processing control.
Event handlers:
on_connected: Called when connected to the STT service.
on_connected: Called when disconnected from the STT service.
on_connection_error: Called when a connection to the STT service error occurs.
Example::
@tts.event_handler("on_connected")
async def on_connected(tts: TTSService):
logger.debug(f"TTS connected")
@tts.event_handler("on_disconnected")
async def on_disconnected(tts: TTSService):
logger.debug(f"TTS disconnected")
@tts.event_handler("on_connection_error")
async def on_connection_error(stt: TTSService, error: str):
logger.error(f"TTS connection error: {error}")
"""
def __init__(
@@ -143,6 +162,10 @@ class TTSService(AIService):
self._processing_text: bool = False
self._register_event_handler("on_connected")
self._register_event_handler("on_disconnected")
self._register_event_handler("on_connection_error")
@property
def sample_rate(self) -> int:
"""Get the current sample rate for audio output.
@@ -626,7 +649,6 @@ class WebsocketTTSService(TTSService, WebsocketService):
"""
TTSService.__init__(self, **kwargs)
WebsocketService.__init__(self, reconnect_on_error=reconnect_on_error, **kwargs)
self._register_event_handler("on_connection_error")
async def _report_error(self, error: ErrorFrame):
await self._call_event_handler("on_connection_error", error.error)
@@ -678,15 +700,6 @@ class WebsocketWordTTSService(WordTTSService, WebsocketService):
"""Base class for websocket-based TTS services that support word timestamps.
Combines word timestamp functionality with websocket connectivity.
Event handlers:
on_connection_error: Called when a websocket connection error occurs.
Example::
@tts.event_handler("on_connection_error")
async def on_connection_error(tts: TTSService, error: str):
logger.error(f"TTS connection error: {error}")
"""
def __init__(self, *, reconnect_on_error: bool = True, **kwargs):
@@ -698,7 +711,6 @@ class WebsocketWordTTSService(WordTTSService, WebsocketService):
"""
WordTTSService.__init__(self, **kwargs)
WebsocketService.__init__(self, reconnect_on_error=reconnect_on_error, **kwargs)
self._register_event_handler("on_connection_error")
async def _report_error(self, error: ErrorFrame):
await self._call_event_handler("on_connection_error", error.error)

View File

@@ -232,6 +232,9 @@ class BaseInputTransport(FrameProcessor):
"""
# Cancel and wait for the audio input task to finish.
await self._cancel_audio_task()
# Stop audio filter.
if self._params.audio_in_filter:
await self._params.audio_in_filter.stop()
async def set_transport_ready(self, frame: StartFrame):
"""Called when the transport is ready to stream.

View File

@@ -293,15 +293,15 @@ class BaseOutputTransport(FrameProcessor):
"""
await super().process_frame(frame, direction)
#
# System frames (like InterruptionFrame) are pushed immediately. Other
# frames require order so they are put in the sink queue.
#
if isinstance(frame, StartFrame):
# Push StartFrame before start(), because we want StartFrame to be
# processed by every processor before any other frame is processed.
await self.push_frame(frame, direction)
await self.start(frame)
elif isinstance(frame, EndFrame):
await self.stop(frame)
# Keep pushing EndFrame down so all the pipeline stops nicely.
await self.push_frame(frame, direction)
elif isinstance(frame, CancelFrame):
await self.cancel(frame)
await self.push_frame(frame, direction)
@@ -314,21 +314,6 @@ class BaseOutputTransport(FrameProcessor):
await self.write_dtmf(frame)
elif isinstance(frame, SystemFrame):
await self.push_frame(frame, direction)
# Control frames.
elif isinstance(frame, EndFrame):
await self.stop(frame)
# Keep pushing EndFrame down so all the pipeline stops nicely.
await self.push_frame(frame, direction)
elif isinstance(frame, MixerControlFrame):
await self._handle_frame(frame)
# Other frames.
elif isinstance(frame, OutputAudioRawFrame):
await self._handle_frame(frame)
elif isinstance(frame, (OutputImageRawFrame, SpriteFrame)):
await self._handle_frame(frame)
# TODO(aleix): Images and audio should support presentation timestamps.
elif frame.pts:
await self._handle_frame(frame)
elif direction == FrameDirection.UPSTREAM:
await self.push_frame(frame, direction)
else:
@@ -410,6 +395,13 @@ class BaseOutputTransport(FrameProcessor):
# Indicates if the bot is currently speaking.
self._bot_speaking = False
# Last time a BotSpeakingFrame was pushed.
self._bot_speaking_frame_time = 0
# How often a BotSpeakingFrame should be pushed (value should be
# lower than the audio chunks).
self._bot_speaking_frame_period = 0.2
# Last time the bot actually spoke.
self._bot_speech_last_time = 0
self._audio_task: Optional[asyncio.Task] = None
self._video_task: Optional[asyncio.Task] = None
@@ -601,39 +593,71 @@ class BaseOutputTransport(FrameProcessor):
async def _bot_started_speaking(self):
"""Handle bot started speaking event."""
if not self._bot_speaking:
logger.debug(
f"Bot{f' [{self._destination}]' if self._destination else ''} started speaking"
)
if self._bot_speaking:
return
downstream_frame = BotStartedSpeakingFrame()
downstream_frame.transport_destination = self._destination
upstream_frame = BotStartedSpeakingFrame()
upstream_frame.transport_destination = self._destination
await self._transport.push_frame(downstream_frame)
await self._transport.push_frame(upstream_frame, FrameDirection.UPSTREAM)
logger.debug(
f"Bot{f' [{self._destination}]' if self._destination else ''} started speaking"
)
self._bot_speaking = True
downstream_frame = BotStartedSpeakingFrame()
downstream_frame.transport_destination = self._destination
upstream_frame = BotStartedSpeakingFrame()
upstream_frame.transport_destination = self._destination
await self._transport.push_frame(downstream_frame)
await self._transport.push_frame(upstream_frame, FrameDirection.UPSTREAM)
self._bot_speaking = True
async def _bot_stopped_speaking(self):
"""Handle bot stopped speaking event."""
if self._bot_speaking:
logger.debug(
f"Bot{f' [{self._destination}]' if self._destination else ''} stopped speaking"
)
if not self._bot_speaking:
return
downstream_frame = BotStoppedSpeakingFrame()
downstream_frame.transport_destination = self._destination
upstream_frame = BotStoppedSpeakingFrame()
upstream_frame.transport_destination = self._destination
await self._transport.push_frame(downstream_frame)
await self._transport.push_frame(upstream_frame, FrameDirection.UPSTREAM)
logger.debug(
f"Bot{f' [{self._destination}]' if self._destination else ''} stopped speaking"
)
self._bot_speaking = False
downstream_frame = BotStoppedSpeakingFrame()
downstream_frame.transport_destination = self._destination
upstream_frame = BotStoppedSpeakingFrame()
upstream_frame.transport_destination = self._destination
await self._transport.push_frame(downstream_frame)
await self._transport.push_frame(upstream_frame, FrameDirection.UPSTREAM)
# Clean audio buffer (there could be tiny left overs if not multiple
# to our output chunk size).
self._audio_buffer = bytearray()
self._bot_speaking = False
# Clean audio buffer (there could be tiny left overs if not multiple
# to our output chunk size).
self._audio_buffer = bytearray()
async def _bot_currently_speaking(self):
"""Handle bot speaking event."""
await self._bot_started_speaking()
diff_time = time.time() - self._bot_speaking_frame_time
if diff_time >= self._bot_speaking_frame_period:
await self._transport.push_frame(BotSpeakingFrame())
await self._transport.push_frame(BotSpeakingFrame(), FrameDirection.UPSTREAM)
self._bot_speaking_frame_time = time.time()
self._bot_speech_last_time = time.time()
async def _maybe_bot_currently_speaking(self, frame: SpeechOutputAudioRawFrame):
if not is_silence(frame.audio):
await self._bot_currently_speaking()
else:
silence_duration = time.time() - self._bot_speech_last_time
if silence_duration > BOT_VAD_STOP_SECS:
await self._bot_stopped_speaking()
async def _handle_bot_speech(self, frame: Frame):
# TTS case.
if isinstance(frame, TTSAudioRawFrame):
await self._bot_currently_speaking()
# Speech stream case.
elif isinstance(frame, SpeechOutputAudioRawFrame):
await self._maybe_bot_currently_speaking(frame)
async def _handle_frame(self, frame: Frame):
"""Handle various frame types with appropriate processing.
@@ -641,7 +665,9 @@ class BaseOutputTransport(FrameProcessor):
Args:
frame: The frame to handle.
"""
if isinstance(frame, OutputImageRawFrame):
if isinstance(frame, OutputAudioRawFrame):
await self._handle_bot_speech(frame)
elif isinstance(frame, OutputImageRawFrame):
await self._set_video_image(frame)
elif isinstance(frame, SpriteFrame):
await self._set_video_images(frame.images)
@@ -705,39 +731,7 @@ class BaseOutputTransport(FrameProcessor):
async def _audio_task_handler(self):
"""Main audio processing task handler."""
# Push a BotSpeakingFrame every 200ms, we don't really need to push it
# at every audio chunk. If the audio chunk is bigger than 200ms, push at
# every audio chunk.
TOTAL_CHUNK_MS = self._params.audio_out_10ms_chunks * 10
BOT_SPEAKING_CHUNK_PERIOD = max(int(200 / TOTAL_CHUNK_MS), 1)
bot_speaking_counter = 0
speech_last_speaking_time = 0
async for frame in self._next_frame():
# Notify the bot started speaking upstream if necessary and that
# it's actually speaking.
is_speaking = False
if isinstance(frame, TTSAudioRawFrame):
is_speaking = True
elif isinstance(frame, SpeechOutputAudioRawFrame):
if not is_silence(frame.audio):
is_speaking = True
speech_last_speaking_time = time.time()
else:
silence_duration = time.time() - speech_last_speaking_time
if silence_duration > BOT_VAD_STOP_SECS:
await self._bot_stopped_speaking()
if is_speaking:
await self._bot_started_speaking()
if bot_speaking_counter % BOT_SPEAKING_CHUNK_PERIOD == 0:
await self._transport.push_frame(BotSpeakingFrame())
await self._transport.push_frame(
BotSpeakingFrame(), FrameDirection.UPSTREAM
)
bot_speaking_counter = 0
bot_speaking_counter += 1
# No need to push EndFrame, it's pushed from process_frame().
if isinstance(frame, EndFrame):
break

View File

@@ -16,7 +16,7 @@ import time
from concurrent.futures import CancelledError as FuturesCancelledError
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, Mapping, Optional
from typing import Any, Awaitable, Callable, Dict, Mapping, Optional, Tuple
import aiohttp
from loguru import logger
@@ -419,6 +419,11 @@ class DailyAudioTrack:
track: CustomAudioTrack
# This is just a type alias for the errors returned by daily-python. Right now
# they are just a string.
CallClientError = str
class DailyTransportClient(EventHandler):
"""Core client for interacting with Daily's API.
@@ -553,14 +558,17 @@ class DailyTransportClient(EventHandler):
async def send_message(
self, frame: OutputTransportMessageFrame | OutputTransportMessageUrgentFrame
):
) -> Optional[CallClientError]:
"""Send an application message to participants.
Args:
frame: The message frame to send.
Returns:
error: An error description or None.
"""
if not self._joined:
return
return "Unable to send messages before joining."
participant_id = None
if isinstance(
@@ -572,7 +580,7 @@ class DailyTransportClient(EventHandler):
self._client.send_app_message(
frame.message, participant_id, completion=completion_callback(future)
)
await future
return await future
async def read_next_audio_frame(self) -> Optional[InputAudioRawFrame]:
"""Reads the next 20ms audio frame from the virtual speaker."""
@@ -744,32 +752,24 @@ class DailyTransportClient(EventHandler):
self._client.set_user_name(self._bot_name)
try:
(data, error) = await self._join()
(data, error) = await self._join()
if not error:
self._joined = True
self._joining = False
# Increment leave counter if we successfully joined.
self._leave_counter += 1
logger.info(f"Joined {self._room_url}")
if self._params.transcription_enabled:
await self.start_transcription(self._params.transcription_settings)
await self._callbacks.on_joined(data)
self._joined_event.set()
else:
error_msg = f"Error joining {self._room_url}: {error}"
logger.error(error_msg)
await self._callbacks.on_error(error_msg)
except asyncio.TimeoutError:
error_msg = f"Time out joining {self._room_url}"
logger.error(error_msg)
if not error:
self._joined = True
self._joining = False
# Increment leave counter if we successfully joined.
self._leave_counter += 1
logger.info(f"Joined {self._room_url}")
await self._callbacks.on_joined(data)
self._joined_event.set()
else:
error_msg = f"Error joining {self._room_url}: {error}"
logger.error(error_msg)
await self._callbacks.on_error(error_msg)
self._joining = False
async def _join(self):
"""Execute the actual room join operation."""
@@ -828,7 +828,7 @@ class DailyTransportClient(EventHandler):
},
)
return await asyncio.wait_for(future, timeout=10)
return await future
async def leave(self):
"""Leave the Daily room and cleanup resources."""
@@ -847,24 +847,16 @@ class DailyTransportClient(EventHandler):
# Call callback before leaving.
await self._callbacks.on_before_leave()
if self._params.transcription_enabled:
await self.stop_transcription()
# Remove any custom tracks, if any.
for track_name, _ in self._custom_audio_tracks.items():
await self.remove_custom_audio_track(track_name)
try:
error = await self._leave()
if not error:
logger.info(f"Left {self._room_url}")
await self._callbacks.on_left()
else:
error_msg = f"Error leaving {self._room_url}: {error}"
logger.error(error_msg)
await self._callbacks.on_error(error_msg)
except asyncio.TimeoutError:
error_msg = f"Time out leaving {self._room_url}"
error = await self._leave()
if not error:
logger.info(f"Left {self._room_url}")
await self._callbacks.on_left()
else:
error_msg = f"Error leaving {self._room_url}: {error}"
logger.error(error_msg)
await self._callbacks.on_error(error_msg)
@@ -875,7 +867,7 @@ class DailyTransportClient(EventHandler):
future = self._get_event_loop().create_future()
self._client.leave(completion=completion_callback(future))
return await asyncio.wait_for(future, timeout=10)
return await future
def _cleanup(self):
"""Cleanup the Daily client instance."""
@@ -883,7 +875,7 @@ class DailyTransportClient(EventHandler):
self._client.release()
self._client = None
def participants(self):
def participants(self) -> Mapping[str, Any]:
"""Get current participants in the room.
Returns:
@@ -891,7 +883,7 @@ class DailyTransportClient(EventHandler):
"""
return self._client.participants()
def participant_counts(self):
def participant_counts(self) -> Mapping[str, Any]:
"""Get participant count information.
Returns:
@@ -899,165 +891,173 @@ class DailyTransportClient(EventHandler):
"""
return self._client.participant_counts()
async def start_dialout(self, settings):
async def start_dialout(self, settings) -> Tuple[str, Optional[CallClientError]]:
"""Start a dial-out call to a phone number.
Args:
settings: Dial-out configuration settings.
"""
logger.debug(f"Starting dialout: settings={settings}")
Returns:
session_id: Dail-out session ID.
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.start_dialout(settings, completion=completion_callback(future))
error = await future
if error:
logger.error(f"Unable to start dialout: {error}")
return await future
async def stop_dialout(self, participant_id):
async def stop_dialout(self, participant_id) -> Optional[CallClientError]:
"""Stop a dial-out call for a specific participant.
Args:
participant_id: ID of the participant to stop dial-out for.
"""
logger.debug(f"Stopping dialout: participant_id={participant_id}")
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.stop_dialout(participant_id, completion=completion_callback(future))
error = await future
if error:
logger.error(f"Unable to stop dialout: {error}")
return await future
async def send_dtmf(self, settings):
async def send_dtmf(self, settings) -> Optional[CallClientError]:
"""Send DTMF tones during a call.
Args:
settings: DTMF settings including tones and target session.
Returns:
error: An error description or None.
"""
session_id = settings.get("sessionId") or self._dial_out_session_id
if not session_id:
logger.error("Unable to send DTMF: 'sessionId' is not set")
return
return "Can't send DTMF if 'sessionId' is not set"
# Update 'sessionId' field.
settings["sessionId"] = session_id
future = self._get_event_loop().create_future()
self._client.send_dtmf(settings, completion=completion_callback(future))
await future
return await future
async def sip_call_transfer(self, settings):
async def sip_call_transfer(self, settings) -> Optional[CallClientError]:
"""Transfer a SIP call to another destination.
Args:
settings: SIP call transfer settings.
Returns:
error: An error description or None.
"""
session_id = (
settings.get("sessionId") or self._dial_out_session_id or self._dial_in_session_id
)
if not session_id:
logger.error("Unable to transfer SIP call: 'sessionId' is not set")
return
return "Can't transfer SIP call if 'sessionId' is not set"
# Update 'sessionId' field.
settings["sessionId"] = session_id
future = self._get_event_loop().create_future()
self._client.sip_call_transfer(settings, completion=completion_callback(future))
await future
return await future
async def sip_refer(self, settings):
async def sip_refer(self, settings) -> Optional[CallClientError]:
"""Send a SIP REFER request.
Args:
settings: SIP REFER settings.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.sip_refer(settings, completion=completion_callback(future))
await future
return await future
async def start_recording(self, streaming_settings, stream_id, force_new):
async def start_recording(
self, streaming_settings, stream_id, force_new
) -> Tuple[str, Optional[CallClientError]]:
"""Start recording the call.
Args:
streaming_settings: Recording configuration settings.
stream_id: Unique identifier for the recording stream.
force_new: Whether to force a new recording session.
"""
logger.debug(
f"Starting recording: stream_id={stream_id} force_new={force_new} settings={streaming_settings}"
)
Returns:
stream_id: Unique identifier for the recording stream.
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.start_recording(
streaming_settings, stream_id, force_new, completion=completion_callback(future)
)
error = await future
if error:
logger.error(f"Unable to start recording: {error}")
return await future
async def stop_recording(self, stream_id):
async def stop_recording(self, stream_id) -> Optional[CallClientError]:
"""Stop recording the call.
Args:
stream_id: Unique identifier for the recording stream to stop.
"""
logger.debug(f"Stopping recording: stream_id={stream_id}")
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.stop_recording(stream_id, completion=completion_callback(future))
error = await future
if error:
logger.error(f"Unable to stop recording: {error}")
return await future
async def start_transcription(self, settings):
async def start_transcription(self, settings) -> Optional[CallClientError]:
"""Start transcription for the call.
Args:
settings: Transcription configuration settings.
Returns:
error: An error description or None.
"""
if not self._token:
logger.warning("Transcription can't be started without a room token")
return
logger.debug(f"Starting transcription: settings={settings}")
return "Transcription can't be started without a room token"
future = self._get_event_loop().create_future()
self._client.start_transcription(
settings=self._params.transcription_settings.model_dump(exclude_none=True),
completion=completion_callback(future),
)
error = await future
if error:
logger.error(f"Unable to start transcription: {error}")
return await future
async def stop_transcription(self):
"""Stop transcription for the call."""
async def stop_transcription(self) -> Optional[CallClientError]:
"""Stop transcription for the call.
Returns:
error: An error description or None.
"""
if not self._token:
return
logger.debug(f"Stopping transcription")
return "Transcription can't be stopped without a room token"
future = self._get_event_loop().create_future()
self._client.stop_transcription(completion=completion_callback(future))
error = await future
if error:
logger.error(f"Unable to stop transcription: {error}")
return await future
async def send_prebuilt_chat_message(self, message: str, user_name: Optional[str] = None):
async def send_prebuilt_chat_message(
self, message: str, user_name: Optional[str] = None
) -> Optional[CallClientError]:
"""Send a chat message to Daily's Prebuilt main room.
Args:
message: The chat message to send.
user_name: Optional user name that will appear as sender of the message.
Returns:
error: An error description or None.
"""
if not self._joined:
return
return "Can't send message if not joined"
future = self._get_event_loop().create_future()
self._client.send_prebuilt_chat_message(
message, user_name=user_name, completion=completion_callback(future)
)
await future
return await future
async def capture_participant_transcription(self, participant_id: str):
"""Enable transcription capture for a specific participant.
@@ -1177,38 +1177,51 @@ class DailyTransportClient(EventHandler):
return track
async def remove_custom_audio_track(self, track_name: str):
async def remove_custom_audio_track(self, track_name: str) -> Optional[CallClientError]:
"""Remove a custom audio track.
Args:
track_name: Name of the custom audio track to remove.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.remove_custom_audio_track(
track_name=track_name,
completion=completion_callback(future),
)
await future
return await future
async def update_transcription(self, participants=None, instance_id=None):
async def update_transcription(
self, participants=None, instance_id=None
) -> Optional[CallClientError]:
"""Update transcription settings for specific participants.
Args:
participants: List of participant IDs to enable transcription for.
instance_id: Optional transcription instance ID.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.update_transcription(
participants, instance_id, completion=completion_callback(future)
)
await future
return await future
async def update_subscriptions(self, participant_settings=None, profile_settings=None):
async def update_subscriptions(
self, participant_settings=None, profile_settings=None
) -> Optional[CallClientError]:
"""Update media subscription settings.
Args:
participant_settings: Per-participant subscription settings.
profile_settings: Global subscription profile settings.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.update_subscriptions(
@@ -1216,32 +1229,42 @@ class DailyTransportClient(EventHandler):
profile_settings=profile_settings,
completion=completion_callback(future),
)
await future
return await future
async def update_publishing(self, publishing_settings: Mapping[str, Any]):
async def update_publishing(
self, publishing_settings: Mapping[str, Any]
) -> Optional[CallClientError]:
"""Update media publishing settings.
Args:
publishing_settings: Publishing configuration settings.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.update_publishing(
publishing_settings=publishing_settings,
completion=completion_callback(future),
)
await future
return await future
async def update_remote_participants(self, remote_participants: Mapping[str, Any]):
async def update_remote_participants(
self, remote_participants: Mapping[str, Any]
) -> Optional[CallClientError]:
"""Update settings for remote participants.
Args:
remote_participants: Remote participant configuration settings.
Returns:
error: An error description or None.
"""
future = self._get_event_loop().create_future()
self._client.update_remote_participants(
remote_participants=remote_participants, completion=completion_callback(future)
)
await future
return await future
#
#
@@ -1932,7 +1955,9 @@ class DailyOutputTransport(BaseOutputTransport):
Args:
frame: The transport message frame to send.
"""
await self._client.send_message(frame)
error = await self._client.send_message(frame)
if error:
logger.error(f"Unable to send message: {error}")
async def register_video_destination(self, destination: str):
"""Register a video output destination.
@@ -2176,7 +2201,7 @@ class DailyTransport(BaseTransport):
if self._output:
await self._output.queue_frame(frame, FrameDirection.DOWNSTREAM)
def participants(self):
def participants(self) -> Mapping[str, Any]:
"""Get current participants in the room.
Returns:
@@ -2184,7 +2209,7 @@ class DailyTransport(BaseTransport):
"""
return self._client.participants()
def participant_counts(self):
def participant_counts(self) -> Mapping[str, Any]:
"""Get participant count information.
Returns:
@@ -2192,76 +2217,155 @@ class DailyTransport(BaseTransport):
"""
return self._client.participant_counts()
async def start_dialout(self, settings=None):
async def start_dialout(self, settings=None) -> Tuple[str, Optional[CallClientError]]:
"""Start a dial-out call to a phone number.
Args:
settings: Dial-out configuration settings.
"""
await self._client.start_dialout(settings)
async def stop_dialout(self, participant_id):
Returns:
session_id: Dail-out session ID.
error: An error description or None.
"""
logger.debug(f"Starting dialout: settings={settings}")
session_id, error = await self._client.start_dialout(settings)
if error:
logger.error(f"Unable to start dialout: {error}")
return session_id, error
async def stop_dialout(self, participant_id) -> Optional[CallClientError]:
"""Stop a dial-out call for a specific participant.
Args:
participant_id: ID of the participant to stop dial-out for.
"""
await self._client.stop_dialout(participant_id)
async def sip_call_transfer(self, settings):
Returns:
error: An error description or None.
"""
logger.debug(f"Stopping dialout: participant_id={participant_id}")
error = await self._client.stop_dialout(participant_id)
if error:
logger.error(f"Unable to stop dialout: {error}")
return error
async def sip_call_transfer(self, settings) -> Optional[CallClientError]:
"""Transfer a SIP call to another destination.
Args:
settings: SIP call transfer settings.
"""
await self._client.sip_call_transfer(settings)
async def sip_refer(self, settings):
Returns:
error: An error description or None.
"""
logger.debug(f"Staring SIP call transfer: settings={settings}")
error = await self._client.sip_call_transfer(settings)
if error:
logger.error(f"Unable to transfer SIP call: {error}")
return error
async def sip_refer(self, settings) -> Optional[CallClientError]:
"""Send a SIP REFER request.
Args:
settings: SIP REFER settings.
"""
await self._client.sip_refer(settings)
async def start_recording(self, streaming_settings=None, stream_id=None, force_new=None):
Returns:
error: An error description or None.
"""
logger.debug(f"Staring SIP REFER: settings={settings}")
error = await self._client.sip_refer(settings)
if error:
logger.error(f"Unable to perform SIP REFER: {error}")
return error
async def start_recording(
self, streaming_settings=None, stream_id=None, force_new=None
) -> Tuple[str, Optional[CallClientError]]:
"""Start recording the call.
Args:
streaming_settings: Recording configuration settings.
stream_id: Unique identifier for the recording stream.
force_new: Whether to force a new recording session.
"""
await self._client.start_recording(streaming_settings, stream_id, force_new)
async def stop_recording(self, stream_id=None):
Returns:
stream_id: Unique identifier for the recording stream.
error: An error description or None.
"""
logger.debug(
f"Starting recording: stream_id={stream_id} force_new={force_new} settings={streaming_settings}"
)
r_id, error = await self._client.start_recording(streaming_settings, stream_id, force_new)
if error:
logger.error(f"Unable to start recording: {error}")
return r_id, error
async def stop_recording(self, stream_id=None) -> Optional[CallClientError]:
"""Stop recording the call.
Args:
stream_id: Unique identifier for the recording stream to stop.
"""
await self._client.stop_recording(stream_id)
async def start_transcription(self, settings=None):
Returns:
error: An error description or None.
"""
logger.debug(f"Stopping recording: stream_id={stream_id}")
error = await self._client.stop_recording(stream_id)
if error:
logger.error(f"Unable to stop recording: {error}")
return error
async def start_transcription(self, settings=None) -> Optional[CallClientError]:
"""Start transcription for the call.
Args:
settings: Transcription configuration settings.
Returns:
error: An error description or None.
"""
await self._client.start_transcription(settings)
logger.debug(f"Starting transcription: settings={settings}")
async def stop_transcription(self):
"""Stop transcription for the call."""
await self._client.stop_transcription()
error = await self._client.start_transcription(settings)
if error:
logger.error(f"Unable to start transcription: {error}")
return error
async def send_prebuilt_chat_message(self, message: str, user_name: Optional[str] = None):
async def stop_transcription(self) -> Optional[CallClientError]:
"""Stop transcription for the call.
Returns:
error: An error description or None.
"""
logger.debug(f"Stopping transcription")
error = await self._client.stop_transcription()
if error:
logger.error(f"Unable to stop transcription: {error}")
return error
async def send_prebuilt_chat_message(
self, message: str, user_name: Optional[str] = None
) -> Optional[CallClientError]:
"""Send a chat message to Daily's Prebuilt main room.
Args:
message: The chat message to send.
user_name: Optional user name that will appear as sender of the message.
Returns:
error: An error description or None.
"""
await self._client.send_prebuilt_chat_message(message, user_name)
error = await self._client.send_prebuilt_chat_message(message, user_name)
if error:
logger.error(f"Unable to send prebuilt chat message: {error}")
return error
async def capture_participant_transcription(self, participant_id: str):
"""Enable transcription capture for a specific participant.
@@ -2307,32 +2411,66 @@ class DailyTransport(BaseTransport):
participant_id, framerate, video_source, color_format
)
async def update_publishing(self, publishing_settings: Mapping[str, Any]):
async def update_publishing(
self, publishing_settings: Mapping[str, Any]
) -> Optional[CallClientError]:
"""Update media publishing settings.
Args:
publishing_settings: Publishing configuration settings.
"""
await self._client.update_publishing(publishing_settings=publishing_settings)
async def update_subscriptions(self, participant_settings=None, profile_settings=None):
Returns:
error: An error description or None.
"""
logger.debug(f"Updating publishing settings: settings={publishing_settings}")
error = await self._client.update_publishing(publishing_settings=publishing_settings)
if error:
logger.error(f"Unable to update publishing settings: {error}")
return error
async def update_subscriptions(
self, participant_settings=None, profile_settings=None
) -> Optional[CallClientError]:
"""Update media subscription settings.
Args:
participant_settings: Per-participant subscription settings.
profile_settings: Global subscription profile settings.
Returns:
error: An error description or None.
"""
await self._client.update_subscriptions(
participant_settings=participant_settings, profile_settings=profile_settings
logger.debug(
f"Updating subscriptions: participant_settings={participant_settings} profile_settings={profile_settings}"
)
async def update_remote_participants(self, remote_participants: Mapping[str, Any]):
error = await self._client.update_subscriptions(
participant_settings=participant_settings, profile_settings=profile_settings
)
if error:
logger.error(f"Unable to update subscription settings: {error}")
return error
async def update_remote_participants(
self, remote_participants: Mapping[str, Any]
) -> Optional[CallClientError]:
"""Update settings for remote participants.
Args:
remote_participants: Remote participant configuration settings.
Returns:
error: An error description or None.
"""
await self._client.update_remote_participants(remote_participants=remote_participants)
logger.debug(f"Updating remote participants: remote_participants={remote_participants}")
error = await self._client.update_remote_participants(
remote_participants=remote_participants
)
if error:
logger.error(f"Unable to update remote participants: {error}")
return error
async def _on_active_speaker_changed(self, participant: Any):
"""Handle active speaker change events."""
@@ -2340,6 +2478,12 @@ class DailyTransport(BaseTransport):
async def _on_joined(self, data):
"""Handle room joined events."""
if self._params.transcription_enabled:
# We report an error because we are starting transcription
# internally and if it fails we need to know.
error = await self.start_transcription(self._params.transcription_settings)
if error:
await self._on_error(f"Unable to start transcription: {error}")
await self._call_event_handler("on_joined", data)
async def _on_left(self):
@@ -2348,6 +2492,12 @@ class DailyTransport(BaseTransport):
async def _on_before_leave(self):
"""Handle before leave room events."""
if self._params.transcription_enabled:
# We report an error because we are stopping transcription
# internally and if it fails we need to know.
error = await self.stop_transcription()
if error:
await self._on_error(f"Unable to stop transcription: {error}")
await self._call_event_handler("on_before_leave")
async def _on_error(self, error):

View File

@@ -689,3 +689,8 @@ class SmallWebRTCConnection(BaseObject):
)()
if track:
track.set_enabled(signalling_message.enabled)
async def add_ice_candidate(self, candidate):
"""Handle incoming ICE candidates."""
logger.debug(f"Adding remote candidate: {candidate}")
await self.pc.addIceCandidate(candidate)

View File

@@ -14,6 +14,7 @@ from dataclasses import dataclass
from enum import Enum
from typing import Any, Awaitable, Callable, Dict, List, Optional
from aiortc.sdp import candidate_from_sdp
from fastapi import HTTPException
from loguru import logger
@@ -39,6 +40,34 @@ class SmallWebRTCRequest:
request_data: Optional[Any] = None
@dataclass
class IceCandidate:
"""The remote ice candidate object received from the peer connection.
Parameters:
candidate: The ice candidate patch SDP string (Session Description Protocol).
sdp_mid: The SDP mid for the candidate patch.
sdp_mline_index: The SDP mline index for the candidate patch.
"""
candidate: str
sdp_mid: str
sdp_mline_index: int
@dataclass
class SmallWebRTCPatchRequest:
"""Small WebRTC transport session arguments for the runner.
Parameters:
pc_id: Identifier for the peer connection.
candidates: A list of ICE candidate patches.
"""
pc_id: str
candidates: List[IceCandidate]
class ConnectionMode(Enum):
"""Enum defining the connection handling modes."""
@@ -197,6 +226,19 @@ class SmallWebRTCRequestHandler:
logger.debug(f"SmallWebRTC request details: {request}")
raise
async def handle_patch_request(self, request: SmallWebRTCPatchRequest):
"""Handle a SmallWebRTC patch candidate request."""
peer_connection = self._pcs_map.get(request.pc_id)
if not peer_connection:
raise HTTPException(status_code=404, detail="Peer connection not found")
for c in request.candidates:
candidate = candidate_from_sdp(c.candidate)
candidate.sdpMid = c.sdp_mid
candidate.sdpMLineIndex = c.sdp_mline_index
await peer_connection.add_ice_candidate(candidate)
async def close(self):
"""Clear the connection map."""
coros = [pc.disconnect() for pc in self._pcs_map.values()]

View File

@@ -47,6 +47,7 @@ SENTENCE_ENDING_PUNCTUATION: FrozenSet[str] = frozenset(
"!",
"?",
";",
"",
# East Asian punctuation (Chinese (Traditional & Simplified), Japanese, Korean)
"", # Ideographic full stop
"", # Full-width question mark

View File

@@ -905,7 +905,9 @@ def traced_openai_realtime(operation: str) -> Callable:
# Capture context messages being sent
if hasattr(self, "_context") and self._context:
try:
messages = self._context.get_messages_for_logging()
messages = self.get_llm_adapter().get_messages_for_logging(
self._context
)
if messages:
operation_attrs["context_messages"] = json.dumps(messages)
except Exception as e:

View File

@@ -254,7 +254,7 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
try:
await asyncio.wait_for(
asyncio.shield(task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))),
task.run(PipelineTaskParams(loop=asyncio.get_event_loop())),
timeout=1.0,
)
except asyncio.TimeoutError:
@@ -290,7 +290,7 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
await task.queue_frame(TextFrame(text="Hello!"))
try:
await asyncio.wait_for(
asyncio.shield(task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))),
task.run(PipelineTaskParams(loop=asyncio.get_event_loop())),
timeout=1.0,
)
except asyncio.TimeoutError:
@@ -301,11 +301,8 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
identity = IdentityFilter()
pipeline = Pipeline([identity])
task = PipelineTask(pipeline, idle_timeout_secs=0.2)
try:
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
assert False
except asyncio.CancelledError:
assert True
# This shouldn't freeze, so nothing to check really.
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
async def test_no_idle_task(self):
identity = IdentityFilter()
@@ -313,7 +310,7 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
task = PipelineTask(pipeline, idle_timeout_secs=0.2, cancel_on_idle_timeout=False)
try:
await asyncio.wait_for(
asyncio.shield(task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))),
task.run(PipelineTaskParams(loop=asyncio.get_event_loop())),
timeout=0.3,
)
except asyncio.TimeoutError:
@@ -332,11 +329,7 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
),
idle_timeout_secs=0.3,
)
try:
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
assert False
except asyncio.CancelledError:
assert True
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
async def test_idle_task_event_handler_no_frames(self):
identity = IdentityFilter()
@@ -351,11 +344,8 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
idle_timeout = True
await task.cancel()
try:
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
assert False
except asyncio.CancelledError:
assert idle_timeout
await task.run(PipelineTaskParams(loop=asyncio.get_event_loop()))
assert idle_timeout
async def test_idle_task_event_handler_quiet_user(self):
identity = IdentityFilter()
@@ -416,12 +406,15 @@ class TestPipelineTask(unittest.IsolatedAsyncioTestCase):
asyncio.create_task(delayed_frames()),
]
await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
_, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
diff_time = time.time() - start_time
self.assertGreater(diff_time, sleep_time_secs * 3)
# Wait for the pending tasks to complete.
await asyncio.gather(*pending)
async def test_task_cancel_timeout(self):
class CancelFilter(FrameProcessor):
def __init__(self, **kwargs):

View File

@@ -7,10 +7,12 @@
"""Unit tests for ServiceSwitcher and related components."""
import unittest
from dataclasses import dataclass
from pipecat.frames.frames import (
Frame,
ManuallySwitchServiceFrame,
SystemFrame,
TextFrame,
)
from pipecat.pipeline.pipeline import Pipeline
@@ -52,6 +54,13 @@ class MockFrameProcessor(FrameProcessor):
self.frame_count = 0
@dataclass
class DummySystemFrame(SystemFrame):
"""A dummy system frame for testing purposes."""
text: str = ""
class TestServiceSwitcherStrategyManual(unittest.IsolatedAsyncioTestCase):
"""Test cases for ServiceSwitcherStrategyManual."""
@@ -140,14 +149,22 @@ class TestServiceSwitcher(unittest.IsolatedAsyncioTestCase):
# Send some test frames
frames_to_send = [
TextFrame(text="Hello 1"),
DummySystemFrame(text="System Message 1"),
TextFrame(text="Hello 2"),
DummySystemFrame(text="System Message 2"),
TextFrame(text="Hello 3"),
]
await run_test(
switcher,
frames_to_send=frames_to_send,
expected_down_frames=[TextFrame, TextFrame, TextFrame],
expected_down_frames=[
DummySystemFrame,
DummySystemFrame,
TextFrame,
TextFrame,
TextFrame,
],
expected_up_frames=[], # Expect no error frames
)
@@ -156,7 +173,13 @@ class TestServiceSwitcher(unittest.IsolatedAsyncioTestCase):
text_frames = [f for f in self.service1.processed_frames if isinstance(f, TextFrame)]
self.assertEqual(len(text_frames), 3)
# Check that other services don't receive text frames (they might get StartFrame/EndFrame)
# Only service1 should have processed the system frames
system_frames = [
f for f in self.service1.processed_frames if isinstance(f, DummySystemFrame)
]
self.assertEqual(len(system_frames), 2)
# Check that other services don't receive text frames (they still get StartFrame/EndFrame)
service2_text_frames = [
f for f in self.service2.processed_frames if isinstance(f, TextFrame)
]
@@ -166,10 +189,24 @@ class TestServiceSwitcher(unittest.IsolatedAsyncioTestCase):
self.assertEqual(len(service2_text_frames), 0)
self.assertEqual(len(service3_text_frames), 0)
# Check that other services don't receive dummy system frames (they still get StartFrame/EndFrame)
service2_system_frames = [
f for f in self.service2.processed_frames if isinstance(f, DummySystemFrame)
]
service3_system_frames = [
f for f in self.service3.processed_frames if isinstance(f, DummySystemFrame)
]
self.assertEqual(len(service2_system_frames), 0)
self.assertEqual(len(service3_system_frames), 0)
# Verify the actual text frames processed
for i, frame in enumerate(text_frames):
self.assertEqual(frame.text, f"Hello {i + 1}")
# Verify the actual system frames processed
for i, frame in enumerate(system_frames):
self.assertEqual(frame.text, f"System Message {i + 1}")
async def test_service_switching(self):
"""Test that after service switching using ManuallySwitchServiceFrame, the new active service receives frames while others don't."""
switcher = ServiceSwitcher(self.services, ServiceSwitcherStrategyManual)

423
uv.lock generated
View File

@@ -410,16 +410,16 @@ wheels = [
[[package]]
name = "aws-sdk-bedrock-runtime"
version = "0.1.0"
version = "0.1.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "smithy-aws-core", extra = ["eventstream", "json"], marker = "python_full_version >= '3.12'" },
{ name = "smithy-core", marker = "python_full_version >= '3.12'" },
{ name = "smithy-http", extra = ["awscrt"], marker = "python_full_version >= '3.12'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/84/e1/39971b907c83a7525bab112c9b395e1bb6d4bc23bc1712d6d7a050662217/aws_sdk_bedrock_runtime-0.1.0.tar.gz", hash = "sha256:bd062de5a48404f64e1dfe6fb8841fbbf68e8f1798c357d14eb427274cb96a2b", size = 85419, upload-time = "2025-09-29T19:40:01.855Z" }
sdist = { url = "https://files.pythonhosted.org/packages/1d/78/48574454b3cac869df67665e4a403ebfc3abfcfba2c2ff01ccfd67d55f8f/aws_sdk_bedrock_runtime-0.1.1.tar.gz", hash = "sha256:c896f99e675c3a1ab600633a07b785f3dc9fe8ab94f640b1f992b63da2dfc784", size = 82446, upload-time = "2025-10-21T20:25:25.845Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/3d/e1/5b36bffe85010cdcd44730d1c2d5244653d57c002f440141d7fc3b9f1347/aws_sdk_bedrock_runtime-0.1.0-py3-none-any.whl", hash = "sha256:aac6ff47069d456ca5e23083d96a01e3e0cbc215414e6753c289d7d9efef3335", size = 78853, upload-time = "2025-09-29T19:40:00.341Z" },
{ url = "https://files.pythonhosted.org/packages/83/07/62c0b70223d178c138f29124ac2f7973a6ba803abc7735b6a01a85217f3d/aws_sdk_bedrock_runtime-0.1.1-py3-none-any.whl", hash = "sha256:c0336b377b2112cf88197d3d44302fbeb3efb1101989fa49ae55e78f49cfe345", size = 74954, upload-time = "2025-10-21T20:25:24.973Z" },
]
[[package]]
@@ -433,31 +433,31 @@ wheels = [
[[package]]
name = "awscrt"
version = "0.28.1"
version = "0.28.2"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/a0/1c/5c9e6a7375c2a1355aadeb2d06c96c95934ec37ff29ebaab2919f59c3ff1/awscrt-0.28.1.tar.gz", hash = "sha256:70a28fd6ff3e0abb7854ea8a9133bc9e5de681a0d9bdbd8a599a23d13a448685", size = 37956730, upload-time = "2025-09-19T00:58:31.564Z" }
sdist = { url = "https://files.pythonhosted.org/packages/d4/1b/a885a699217967c3ff0e1c49ac5b1e2a050d1a8b87d1e85e958a56e3d3f5/awscrt-0.28.2.tar.gz", hash = "sha256:9715a888f2042e710dc8aeb355963a29b77e7a4cc25a14659cebd21a5fa476c1", size = 37894849, upload-time = "2025-10-14T19:06:16.867Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/2d/75/dd62276f2907a9ffcf9f8f780c08ce9938bd0550a15c887db198b47f24d3/awscrt-0.28.1-cp310-cp310-macosx_10_15_universal2.whl", hash = "sha256:47f885104065918d311102e2b08b943966717c0f3b0c5de5908d2fd08de32198", size = 3376838, upload-time = "2025-09-19T00:57:32.988Z" },
{ url = "https://files.pythonhosted.org/packages/a7/93/562709cdf13a7606548426ecc31326ba3f6839f91e98a1e9230208308afb/awscrt-0.28.1-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:3df2316e77ad88c456b7eb2c9928007d379ed892154c1969d35b98653617e576", size = 3821522, upload-time = "2025-09-19T00:57:35.456Z" },
{ url = "https://files.pythonhosted.org/packages/43/f0/6c6ff81f5a4c6d085eb450854149087bf9240c37c467c747521f47901b32/awscrt-0.28.1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3a060d930939f142345f46a344e19ffc0dada657b04d02216b8adffba550c0a0", size = 4087344, upload-time = "2025-09-19T00:57:36.62Z" },
{ url = "https://files.pythonhosted.org/packages/37/0a/71c097505add4ceea4ac05153311715acb7489cd82ec69db4570130f4698/awscrt-0.28.1-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:43f81ca6bfe85c38ad9765605aaaa646a1ed6fd7210dbedf67c113dd245f425e", size = 3745148, upload-time = "2025-09-19T00:57:38Z" },
{ url = "https://files.pythonhosted.org/packages/79/1b/2b02b705a47b64e6c4d401087ddd30d4ad9af70172812ae8c62fb2b7a70c/awscrt-0.28.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:fc8e2307d9dbe76842015a14701ff7e9cf2619d674621b2d55b769414e17b3fc", size = 3972439, upload-time = "2025-09-19T00:57:39.74Z" },
{ url = "https://files.pythonhosted.org/packages/f1/19/429c81c7a0d81a5edce9cc6d9a878c8b65d8b5b69fa5a2725a6e0b1380c1/awscrt-0.28.1-cp310-cp310-win32.whl", hash = "sha256:6e7b094587e5332d428300340dcc18794a1fcfa76d636f216fc0f5c8405ba604", size = 3915231, upload-time = "2025-09-19T00:57:41.096Z" },
{ url = "https://files.pythonhosted.org/packages/83/81/769ad51fc6dcfd8bf9e0aa59c252013da0eb9e32c050ecbd1fc25f71689a/awscrt-0.28.1-cp310-cp310-win_amd64.whl", hash = "sha256:ac02f10f7384fdb68187f8d5d94743a271b16fa94be81481ce7684942f6a4b35", size = 4051668, upload-time = "2025-09-19T00:57:42.696Z" },
{ url = "https://files.pythonhosted.org/packages/9e/55/0ee537d146f24d6e76eaf02d462a83c572788233603bb9bda969fbf23307/awscrt-0.28.1-cp311-abi3-macosx_10_15_universal2.whl", hash = "sha256:cb36052f9aa34e77687a8037559bbea331fc9d5d77cd71ab0cf4e6d72af73f72", size = 3376673, upload-time = "2025-09-19T00:57:43.875Z" },
{ url = "https://files.pythonhosted.org/packages/f0/54/12700a4b9545680baa3e2d4d0e543bb4775a639df56ee51cbb29b71e0947/awscrt-0.28.1-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:fc59829152a5806eb2708aca5c5084c11dd18ecbe765e03eb314d5a360eeaa62", size = 3782870, upload-time = "2025-09-19T00:57:45.737Z" },
{ url = "https://files.pythonhosted.org/packages/1d/e7/7b189ace9e187b9b55ed4a6ec9a451579b2f16bd01d402f79a19cc8e1603/awscrt-0.28.1-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d2f20bc774599b9d85ce66689415da529ddd1d2215da818e005deedc4688fe61", size = 4048789, upload-time = "2025-09-19T00:57:47.327Z" },
{ url = "https://files.pythonhosted.org/packages/9c/e0/2e5472019906dfcc5fadcdba4bad9e69dabb95bbc0c110cfe555ee8461dc/awscrt-0.28.1-cp311-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:491b8b9c73a288cfd5e0cbdac16aabb5313d5cfc33bbe461763a5ddc26624f70", size = 3687832, upload-time = "2025-09-19T00:57:48.563Z" },
{ url = "https://files.pythonhosted.org/packages/71/f2/7e05d371bb888ee9f15e83d189287838f7b6ea40dfc91eacb3acd24b8529/awscrt-0.28.1-cp311-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:4c6c7125b7e9fcc999eb685d1cace8d4f2ffc63f8f3d8ef7f77e1a97d9552863", size = 3913378, upload-time = "2025-09-19T00:57:50.185Z" },
{ url = "https://files.pythonhosted.org/packages/79/6b/a542a65a22edb85d64742970c21721e66e0f9f67911a11c7a5c3626a1b17/awscrt-0.28.1-cp311-abi3-win32.whl", hash = "sha256:1dcb33d7cf8f69881ac6ef75a5b9b40816be58678b1bb07ccbe0230281bdbc81", size = 3912809, upload-time = "2025-09-19T00:57:51.797Z" },
{ url = "https://files.pythonhosted.org/packages/df/64/16cc8a0011e3ca5dda13605befa7e6db29bfb3073c67f6e8dad90be0a8ae/awscrt-0.28.1-cp311-abi3-win_amd64.whl", hash = "sha256:670caaf556876913bcfb9d8183d43d67a6c7b52998f2f398abd1c21632a006f8", size = 4048979, upload-time = "2025-09-19T00:57:53.061Z" },
{ url = "https://files.pythonhosted.org/packages/ca/ac/debbd3a2f03c5953b56b1c3b321bab16293f857ea3005e3f7e5dded5e0b2/awscrt-0.28.1-cp313-abi3-macosx_10_15_universal2.whl", hash = "sha256:22311d25135b937ee5617e35a6554961727527dcfa3e06efdefe187a6abe65c4", size = 3375565, upload-time = "2025-09-19T00:57:54.598Z" },
{ url = "https://files.pythonhosted.org/packages/ea/4f/9388917ad45c043acd7c4ab2c28b9e2b5ddf29e21a82bfc01a7626c18c04/awscrt-0.28.1-cp313-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:e58740cf0e41552fdf7909e10814b312ab090ebe54741354a61507e0c6d4ebfd", size = 3775366, upload-time = "2025-09-19T00:57:56.238Z" },
{ url = "https://files.pythonhosted.org/packages/8a/e3/3ef301cdef76b22ce14b041e04c6cf65ba4491d00e9f5b400c0699f6c63e/awscrt-0.28.1-cp313-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:9e69f163a207a8b172abbfea1f51045301ed1ac8bbaf76958a6b5e81d72e5b89", size = 4043403, upload-time = "2025-09-19T00:57:57.4Z" },
{ url = "https://files.pythonhosted.org/packages/60/9c/4f89922333724c4da851752549ca97dd147420734ef6c4ece56d5dd65e09/awscrt-0.28.1-cp313-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:592f4b234ecafa6cde86e55e42c4fe84c4e1ffe9fb11b0a8b8f0ffb8c62fa2cc", size = 3678742, upload-time = "2025-09-19T00:57:59.055Z" },
{ url = "https://files.pythonhosted.org/packages/0e/d4/adb97ba5f888ed201aa1f9e9f8d6cfc0dbaf80f0e937b3acb7411febdaa8/awscrt-0.28.1-cp313-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:b16321f1d2bf5b4991a213059c1b5dc07954edfc424d154b093824465ec94ce2", size = 3908438, upload-time = "2025-09-19T00:58:00.71Z" },
{ url = "https://files.pythonhosted.org/packages/41/ac/600ea0a6f4ba6543c50417c8e78b09f2cd73dd0f0d4c3e9e52220a8badbe/awscrt-0.28.1-cp313-abi3-win32.whl", hash = "sha256:3e0a23635aa75b4af163ff9bf5a0873928369b1ac32c8b1351741a95472ccf71", size = 3907625, upload-time = "2025-09-19T00:58:03.235Z" },
{ url = "https://files.pythonhosted.org/packages/9e/24/d22c7197b1e53c76b5eb71d640a4728b9b7621075d8dbcc054e16b5b98f0/awscrt-0.28.1-cp313-abi3-win_amd64.whl", hash = "sha256:9849c88ca0830396724acf988e2759895118fe7dd2a23dab21978c8600d01a11", size = 4043878, upload-time = "2025-09-19T00:58:04.595Z" },
{ url = "https://files.pythonhosted.org/packages/73/b4/1a566e493bdfa6e918ba78bcd2e45dda99a25407a4fd974db2666228d154/awscrt-0.28.2-cp310-cp310-macosx_10_15_universal2.whl", hash = "sha256:bec19c0dd780293a26c809aabb9f7675b28cb3a1bf05b4a5bc9f28d5ced75a81", size = 3380735, upload-time = "2025-10-14T19:05:16.58Z" },
{ url = "https://files.pythonhosted.org/packages/1f/53/6602a87aead1d413c7bd77d059b301745146635cda99ee2a61ec0d23691e/awscrt-0.28.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:01f33076759ba6285f25ccc6016355607df2e715d0bab3a1ef2416b87a6c3ade", size = 3827084, upload-time = "2025-10-14T19:05:19.335Z" },
{ url = "https://files.pythonhosted.org/packages/d8/62/61fe39ae5950ad00e10dcbf6e4f4f344dc93957757160c0000390331a11b/awscrt-0.28.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2b5c807b9972795ce54c05aea6918c60983c51d879ebbff7a67adb8b0d28a121", size = 4092678, upload-time = "2025-10-14T19:05:20.8Z" },
{ url = "https://files.pythonhosted.org/packages/25/7d/e38f18cfb203e8f09842c0e3f422992887ce285ecc3bf18816d559a13c80/awscrt-0.28.2-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:bf4ff9c8c6a233246320c2d41d939b6e25cdae97728d827186e4771a9edda688", size = 3749978, upload-time = "2025-10-14T19:05:22.16Z" },
{ url = "https://files.pythonhosted.org/packages/16/6f/e8a3c0daed8f7b60c76fc2721bd4e83580ddecace24e0cb0ebb99564f699/awscrt-0.28.2-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:0c738b83b66d1a8b43089556247fbe4adf2b73d610c7938d3bae1718a0fe8b1d", size = 3977237, upload-time = "2025-10-14T19:05:23.368Z" },
{ url = "https://files.pythonhosted.org/packages/92/3d/8400203f02dd924bcc8255703179b0c26efd03c84f838db6f026fcef9ba6/awscrt-0.28.2-cp310-cp310-win32.whl", hash = "sha256:23c30004c736a2f826a32c9720f1ccf71e8e4deb8535da5915d6073604853098", size = 3919413, upload-time = "2025-10-14T19:05:24.477Z" },
{ url = "https://files.pythonhosted.org/packages/c0/5e/b5ccf377880a70425b100f1e5f5ba516ff75e291585b3dc129239fbd1ec3/awscrt-0.28.2-cp310-cp310-win_amd64.whl", hash = "sha256:859ae8a195d51f15b631147d6792953a563bfe0a1cc7a75b6750977634de54b8", size = 4056024, upload-time = "2025-10-14T19:05:25.956Z" },
{ url = "https://files.pythonhosted.org/packages/ed/79/94e9f0ee7c60ec6233c7ad6293589c56d5145172e49eb5328eda37d3fdd1/awscrt-0.28.2-cp311-abi3-macosx_10_15_universal2.whl", hash = "sha256:025eab99b58586d8c95f8fafe1f4695ad477eda20d1207240ee4f8ee79742059", size = 3381061, upload-time = "2025-10-14T19:05:27.187Z" },
{ url = "https://files.pythonhosted.org/packages/2d/b8/0da80dd58682ddf3ec204e877d5891198654647c085e65b6b8eacd214edb/awscrt-0.28.2-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:e5c18d035d6cd92228e1db2f043517c1bcf9e0f6430c0af60cc34257dcca092c", size = 3788011, upload-time = "2025-10-14T19:05:28.768Z" },
{ url = "https://files.pythonhosted.org/packages/d6/d2/f51cf4364364399fe90d557e2fed14c1f114720191a5825898b1242bd607/awscrt-0.28.2-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:c75f077e90d0220a49b75a9bca914e5aa1a3c8f28af6bce4d0332be0b98dd3cb", size = 4055226, upload-time = "2025-10-14T19:05:30.054Z" },
{ url = "https://files.pythonhosted.org/packages/41/47/0fde8738a8c76de278ce431d8468ef18aeaca424329decca9ad5092df812/awscrt-0.28.2-cp311-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:1432c5c59a7e36b33eb2746cfbf30058f19ed43f2c117863897681f70bc246ba", size = 3692839, upload-time = "2025-10-14T19:05:31.471Z" },
{ url = "https://files.pythonhosted.org/packages/18/25/cb3762f6b47fe503eea7f337eca7cfd044ab28bcc2452fbf298c6492ec8b/awscrt-0.28.2-cp311-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:f96703c30b22ba1e43e1bb2fe996ac7af513bea411c54dbf09a3a1af329b9a76", size = 3918023, upload-time = "2025-10-14T19:05:33.162Z" },
{ url = "https://files.pythonhosted.org/packages/95/0a/0b609acd45dbb83c04c7ecb8c7c789f5c15bbdd422129360bde093bc4a99/awscrt-0.28.2-cp311-abi3-win32.whl", hash = "sha256:3e94f63497b454d30892d7a7ce917a451c6f33590964d3a475d93f93b20083b6", size = 3917048, upload-time = "2025-10-14T19:05:34.745Z" },
{ url = "https://files.pythonhosted.org/packages/d1/38/bf33abd6d09c8572f8e09488db2b0a60124767d7f5d6d9a33cf8b051b7af/awscrt-0.28.2-cp311-abi3-win_amd64.whl", hash = "sha256:3e094772b1f6fd0f8c5f7cf37655d0984739f99493f66f534979a2a7bb7fc9f6", size = 4052877, upload-time = "2025-10-14T19:05:36.01Z" },
{ url = "https://files.pythonhosted.org/packages/10/71/4be198e472d95702434cee1f9dd889c56e22bea8554b466fad754148fd24/awscrt-0.28.2-cp313-abi3-macosx_10_15_universal2.whl", hash = "sha256:5fda9e7d0eb800491fadebe2b6c2560ac2f5742b60f4106440dca4b49da7fb03", size = 3379585, upload-time = "2025-10-14T19:05:37.225Z" },
{ url = "https://files.pythonhosted.org/packages/43/09/77084249d07dca71352341ad3fbcfa75deaccf25bd65f9fdbb36ce1f978b/awscrt-0.28.2-cp313-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:994a795bdc83344922a15891abb30155ec292093e856eef3929dd63dd6cadaca", size = 3779843, upload-time = "2025-10-14T19:05:38.774Z" },
{ url = "https://files.pythonhosted.org/packages/a6/bb/fcee9365e58e5860582398317571a9a5517da258cd81c3d987b9882f61d4/awscrt-0.28.2-cp313-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:28537c4517168927ef74aa007a2e0c9f436921227934d82da31e9a1cec7e0c4a", size = 4049154, upload-time = "2025-10-14T19:05:40.301Z" },
{ url = "https://files.pythonhosted.org/packages/ba/8e/ac92b2707dbe05e56d0dd5af73cb4e07a3da4aee66936071123966523759/awscrt-0.28.2-cp313-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:b9fc6be63832da3ff244d56c7d9a43326d89d79e68162419c35f33e6ad033be0", size = 3683672, upload-time = "2025-10-14T19:05:41.536Z" },
{ url = "https://files.pythonhosted.org/packages/ef/d0/15308ec37e762691f5d1871b0f1a6e462da8e421c6c38d6724e3cf0994b2/awscrt-0.28.2-cp313-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:efb57103a368de1d33148cb70a382c4f82ac376c744de9484e0f621cef8313f3", size = 3912823, upload-time = "2025-10-14T19:05:43.781Z" },
{ url = "https://files.pythonhosted.org/packages/bc/cd/7693b1d72069908b7a3ee30e4ef2b5fc8f54948a96397729277cb0b0c7b4/awscrt-0.28.2-cp313-abi3-win32.whl", hash = "sha256:594dc61f4f0c1c9fb7292364d25c21810b3608cd67c0de78a032ad48f7bfd88c", size = 3911514, upload-time = "2025-10-14T19:05:45.019Z" },
{ url = "https://files.pythonhosted.org/packages/93/d6/5d8545c967690f03d55d44ed56ceff26d88363cd7d0435fd80a1c843ac2a/awscrt-0.28.2-cp313-abi3-win_amd64.whl", hash = "sha256:a17f0ab9dc5e5301da0fb00ccc4511a136d13abbd4a9564827547333fcd7ba16", size = 4047912, upload-time = "2025-10-14T19:05:46.302Z" },
]
[[package]]
@@ -569,6 +569,30 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/39/54/db7a801933dd2537f5376fb8a9e28caff488ef5c2d61f3a8fced55fe6336/blake3-1.0.7-cp313-cp313t-musllinux_1_1_x86_64.whl", hash = "sha256:d9046bb1e22a8607e1d0d7c3ff47e56e0a197c988502df4bf4d78563f3e9fe2c", size = 553411, upload-time = "2025-09-29T16:40:45.667Z" },
{ url = "https://files.pythonhosted.org/packages/2c/08/949cf68d16d1f731d502968bb1486e1a4bf7ef032c38fbc2ef26a2353494/blake3-1.0.7-cp313-cp313t-win32.whl", hash = "sha256:bd2f638bcc00fc09ce985ea3c642d45940e1eda198ab1f4b90cfdecbebbc9315", size = 227049, upload-time = "2025-09-29T16:40:47.446Z" },
{ url = "https://files.pythonhosted.org/packages/f2/ae/6783a5ca6235024e00a1e92ab6ca2cd855f4c61c763cf8d6d643846d110c/blake3-1.0.7-cp313-cp313t-win_amd64.whl", hash = "sha256:cb3aa1db14231c2ef0ec5acd805505ce128c39ffa510deb3384eed96fe4addcb", size = 214101, upload-time = "2025-09-29T16:40:48.656Z" },
{ url = "https://files.pythonhosted.org/packages/32/aa/99b4b6c22972b9a854f77d97846a717448a77d079e4bd38e46a3f8ecea76/blake3-1.0.7-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:f7db997205aa420d59fb5639346e40beafb9c09252e2ec6efedca8f230f7520c", size = 346664, upload-time = "2025-10-11T18:02:54.609Z" },
{ url = "https://files.pythonhosted.org/packages/f9/44/e98bc5450be415a335a191b154e299e335046d11fe9514d93961902b7aed/blake3-1.0.7-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:19afec6e276f3bc154541248d92b1ecb198af2ee920025f7ce521028f9a69d8b", size = 324576, upload-time = "2025-10-11T18:02:57.062Z" },
{ url = "https://files.pythonhosted.org/packages/74/25/23a39913c8424ac3df705ed71a00efe34cc1cdbd4588ed6eaf458ea9d7ef/blake3-1.0.7-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:006a11bbba65a95e88ddc069cca751c8812fd144d582715eeea512452fdbe80d", size = 370545, upload-time = "2025-10-11T18:02:59.824Z" },
{ url = "https://files.pythonhosted.org/packages/db/83/9f53a86de9a5999b043febfd84765d240014da42055aeac06d1005b20b07/blake3-1.0.7-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7febeffdc8412fed105ca517cee641ac521fb9cfb750bf7e27a5cdf3ddf74a08", size = 374370, upload-time = "2025-10-11T18:03:01.412Z" },
{ url = "https://files.pythonhosted.org/packages/c4/4c/3290aa4fb7483975a7b3322a73692aa3cf491a77ce7ac61c216c71c6f834/blake3-1.0.7-cp314-cp314-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6c032ce7c52b71015651c0abe9fe599aa2669e6be578aa17d5f993dc93373401", size = 447808, upload-time = "2025-10-11T18:03:02.893Z" },
{ url = "https://files.pythonhosted.org/packages/66/26/92b6e15552865416aae1aedad8b9b4d8b47ca9b73d25373622b1798c05a9/blake3-1.0.7-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5b81455f7d24b58fe26be037cc3854c28ea6eb3671ceab3b1ec0b1239aeb6fef", size = 506118, upload-time = "2025-10-11T18:03:04.51Z" },
{ url = "https://files.pythonhosted.org/packages/1b/ef/f158fc43a03fd366bc428a52a845bd0f884e518deda901c9216bd469867e/blake3-1.0.7-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:41b0127b0e7c8610054c421959dbe7140a81ac2c88fa9e099994fbaa529af3c1", size = 393239, upload-time = "2025-10-11T18:03:07.102Z" },
{ url = "https://files.pythonhosted.org/packages/10/49/2a56ce897ec7ed0e25953b3873da271ea60cc107ae02ecc6655252e554c7/blake3-1.0.7-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4755ca95b4114b629d8f3570bc661916d211d52d47f57ff70e9687377ab39cb9", size = 386267, upload-time = "2025-10-11T18:03:08.904Z" },
{ url = "https://files.pythonhosted.org/packages/d9/c4/ee4c03ea419198b91c889ef173015b5d637a390d3f7d63cb70033a7201d6/blake3-1.0.7-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:8abe929cfd27b375e02e3dd7a690192fa4efecc52ef510df91ef01651ef08dc7", size = 549641, upload-time = "2025-10-11T18:03:10.64Z" },
{ url = "https://files.pythonhosted.org/packages/b2/cc/a918d6649b56fe705133e06d9958d90978aad30063d42cca4dfe23db16e9/blake3-1.0.7-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:dd607eb5ad5a9b44ff62243759aa0af4085f6f43c9b01f503561a70da63e3b94", size = 553691, upload-time = "2025-10-11T18:03:12.108Z" },
{ url = "https://files.pythonhosted.org/packages/fd/9f/568546f555fd1555d4867c497e9413f67bf769d076e773b9ca9e07a0b6f6/blake3-1.0.7-cp314-cp314-win32.whl", hash = "sha256:a51684d1f346e7680f7c244c25b0e279e3b297f1938126e4ea8e32425ea269f5", size = 227552, upload-time = "2025-10-11T18:03:13.468Z" },
{ url = "https://files.pythonhosted.org/packages/97/2b/d4ef7365d9f601c8a127b5993f2662d45d2cb6d430bf3dbbb7a6f0b33639/blake3-1.0.7-cp314-cp314-win_amd64.whl", hash = "sha256:a6a481719e28e2c61aafd4273d32663365d97613341b72fcdf2f6afbd426319b", size = 214719, upload-time = "2025-10-11T18:03:14.835Z" },
{ url = "https://files.pythonhosted.org/packages/2f/53/f697cc34e382a225d163ea0c6a35c7eb4cfd1011e85db6610adfac98e522/blake3-1.0.7-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:daa8933cd7db19143bd6b59f7ac4c7c7446767d7b2c3a748a4559aa483275fa2", size = 347071, upload-time = "2025-10-11T18:03:16.637Z" },
{ url = "https://files.pythonhosted.org/packages/4c/85/836dcb5c5709c2331f02ce065f7ebfaae710a6c1768cdc47ee3197645f98/blake3-1.0.7-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:24074adfffffe0fa7a7dd930cc608d6e965e70306e2c1e14d412e29ec94fa360", size = 324341, upload-time = "2025-10-11T18:03:18.073Z" },
{ url = "https://files.pythonhosted.org/packages/6d/48/36b2c25007933619ce60e24b9f360baaa77d08939284045476c8e157fe62/blake3-1.0.7-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:dce6e6f03de2674f9860cf330d8a4fcdb63a60659435e5e31d72d174fc102d8e", size = 370140, upload-time = "2025-10-11T18:03:19.582Z" },
{ url = "https://files.pythonhosted.org/packages/70/82/8a8977e5d56b9fb719033940c8ce34afc733190d34ab868a647a9af7b584/blake3-1.0.7-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:e783f33d53a2de8d2ab845235dd53393d521b5e4a76c23d03e77e472266359d3", size = 373022, upload-time = "2025-10-11T18:03:21.143Z" },
{ url = "https://files.pythonhosted.org/packages/e2/c4/44017ba40804a528568b35a36c05187786830c4d891c5540d59a121a7cec/blake3-1.0.7-cp314-cp314t-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:782784aef18eb61f4ce8bf2b9506b7d90f0d183176b453345b221837a18041b7", size = 447243, upload-time = "2025-10-11T18:03:22.707Z" },
{ url = "https://files.pythonhosted.org/packages/78/c1/4fa20e68624784082734d31b8c9c80ad226658c024e61b9f9b6751ba0a4a/blake3-1.0.7-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6062122e77f40e3733cac2ef3f25e0fc7f555e352fe6f513f8404ad11dc69974", size = 506149, upload-time = "2025-10-11T18:03:24.424Z" },
{ url = "https://files.pythonhosted.org/packages/8e/63/af65466e27e7b92800a068afaee11b2fa071e34a7f5900f8e13832f18185/blake3-1.0.7-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6c2614bc9d69fd6067571f3bb37b3b07a6b86a56167553ad4784a3c508771f39", size = 393243, upload-time = "2025-10-11T18:03:25.872Z" },
{ url = "https://files.pythonhosted.org/packages/f3/82/54a4807a3243d0e094ada9d65687aeb40059587e374b3beb9c89f6552c9b/blake3-1.0.7-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d6df2bd56c43bdeb6699d4af0a0dd0d77537d95cb4a5dde4b39ed6e54cc725d6", size = 386318, upload-time = "2025-10-11T18:03:27.338Z" },
{ url = "https://files.pythonhosted.org/packages/42/e8/32b56531b5d9da67e476735ceaec7c3bf89310629abeeafb03c724145c88/blake3-1.0.7-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:8b635cf4350caf459ecb335b32be622068423245bda457d5bc159106eb20f912", size = 548945, upload-time = "2025-10-11T18:03:28.779Z" },
{ url = "https://files.pythonhosted.org/packages/ad/50/33b1aca708be629e285a537f1adf34dfcabc4c30b28c436361323d11f593/blake3-1.0.7-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:f96a685775f87ddf75ff495dc9698703268c66c170caca977347427ef8d52324", size = 553564, upload-time = "2025-10-11T18:03:30.247Z" },
{ url = "https://files.pythonhosted.org/packages/fe/07/8b17cbf40ccd9afeed6ae9f55018181786b30ff4e079ac8bf4ca4799e47b/blake3-1.0.7-cp314-cp314t-win32.whl", hash = "sha256:0633b7d9bad87dc7fce545042353f2e056604d993f71d1dce666a9f5edc13e05", size = 227345, upload-time = "2025-10-11T18:03:31.933Z" },
{ url = "https://files.pythonhosted.org/packages/d9/8a/ab9de8a73616350759356a483f440212bc2a22fc9aaa77cabbf06c3483db/blake3-1.0.7-cp314-cp314t-win_amd64.whl", hash = "sha256:5e356daa0089968dc1ff1d0d112e7cc1700533441d8f30ae99f835a94dc8b0f3", size = 213964, upload-time = "2025-10-11T18:03:33.919Z" },
]
[[package]]
@@ -867,16 +891,16 @@ wheels = [
[[package]]
name = "compressed-tensors"
version = "0.10.2"
version = "0.10.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "pydantic" },
{ name = "torch" },
{ name = "transformers" },
]
sdist = { url = "https://files.pythonhosted.org/packages/c0/86/d43d369abc81ec63ec7b8f6f27fc8b113ea0fd18a4116ae12063387b8b34/compressed_tensors-0.10.2.tar.gz", hash = "sha256:6de13ac535d7ffdd8890fad3d229444c33076170acaa8fab6bab8ecfa96c1d8f", size = 173459, upload-time = "2025-06-23T13:19:06.135Z" }
sdist = { url = "https://files.pythonhosted.org/packages/40/eb/2229523a539e8074b238c225d168f734f6f056ab4ea2278eefe752f4a6f3/compressed_tensors-0.10.1.tar.gz", hash = "sha256:f99ce620ddcf8a657eaa7995daf5faa8e988d4b4cadc595bf2c4ff9346c2c19a", size = 126778, upload-time = "2025-06-06T18:25:16.538Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/43/ac/56bb4b6b3150783119479e2f05e32ebfc39ca6ff8e6fcd45eb178743b39e/compressed_tensors-0.10.2-py3-none-any.whl", hash = "sha256:e1b4d9bc2006e3fd3a938e59085f318fdb280c5af64688a4792bf1bc263e579d", size = 169030, upload-time = "2025-06-23T13:19:03.487Z" },
{ url = "https://files.pythonhosted.org/packages/5b/07/e70a0b9efc24a32740396c404e7213c62b8aeb4a577ed5a3f191f8d7806b/compressed_tensors-0.10.1-py3-none-any.whl", hash = "sha256:b8890735522c119900e8d4192cced0b0f70a98440ae070448cb699165c404659", size = 116998, upload-time = "2025-06-06T18:25:14.54Z" },
]
[[package]]
@@ -1258,13 +1282,13 @@ wheels = [
[[package]]
name = "daily-python"
version = "0.19.9"
version = "0.21.0"
source = { registry = "https://pypi.org/simple" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/22/85/6064c3225e5b190e522e8f3bc6a460efc5e3e6632f16fd5f9799c44ba57a/daily_python-0.19.9-cp37-abi3-macosx_10_15_x86_64.whl", hash = "sha256:cbc558ad7d49e79b550bf7567b9ceae75e2864d4fcaf41c90377b620e38a2461", size = 13365213, upload-time = "2025-09-06T00:31:00.224Z" },
{ url = "https://files.pythonhosted.org/packages/23/58/af986c6881180a46a7b60dd418ce58d6d7c0c4ffc48d261748067c679317/daily_python-0.19.9-cp37-abi3-macosx_11_0_arm64.whl", hash = "sha256:446bb9ee848d88bc68ca29a2216793c9b5ebaf5991bf604daf76f7c5a53d5919", size = 11711673, upload-time = "2025-09-06T00:31:02.526Z" },
{ url = "https://files.pythonhosted.org/packages/9d/48/1cad4c3e92cdb5ef06467d972c76a510fe5e807513334b10ad7f8c21bf74/daily_python-0.19.9-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:2facaf82b614404c642c70bbf0874fb045d8ad46400acb051470cd4df93cb4db", size = 13679393, upload-time = "2025-09-06T00:31:04.999Z" },
{ url = "https://files.pythonhosted.org/packages/3c/e9/354f4699619e83d13e266256b2352b21741ac527e3e5ab5f2264d5c482cd/daily_python-0.19.9-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:ffc205efca7b47739efd358febab17577248c8db2ebc4d17d819307a83b9eefc", size = 14221932, upload-time = "2025-09-06T00:31:07.471Z" },
{ url = "https://files.pythonhosted.org/packages/ff/11/99590f8b7aad077f3f9b5b59d39b010aee0bd01b14dece8ae1e93d8080e7/daily_python-0.21.0-cp37-abi3-macosx_10_15_x86_64.whl", hash = "sha256:bdec96417825181559769bb2258ae688d1215949a1878336194e36fb452274a8", size = 13277066, upload-time = "2025-10-29T00:20:49.523Z" },
{ url = "https://files.pythonhosted.org/packages/e5/db/8c57f1a1b713ba3393584ac2be32d8074d3022a2c2c17c28eb4cd2aa3629/daily_python-0.21.0-cp37-abi3-macosx_11_0_arm64.whl", hash = "sha256:18677fa1415a0dc48b891cdf2fb8fe9dabc70e1b019d5aaa3d0699ccc8d187c9", size = 11644908, upload-time = "2025-10-29T00:20:52.106Z" },
{ url = "https://files.pythonhosted.org/packages/64/b6/b03f2f58a367d6ef4bb728715471542fdfa68afa8a177670139c3a2aadb7/daily_python-0.21.0-cp37-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:97eb97352fe74227061b678e330b8befcfa4c694feb6eb2b09fe6eacec00ad6d", size = 13652356, upload-time = "2025-10-29T00:20:54.813Z" },
{ url = "https://files.pythonhosted.org/packages/f6/76/bde65f6f8d4c1679dc6c185fa37dae9223f6ddb4b7ced728ef46504956f7/daily_python-0.21.0-cp37-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:68c3e36f609fc2fce79e4d17ecf1021eadd836506db6c5125f95c682bcf3612a", size = 14304643, upload-time = "2025-10-29T00:20:57.194Z" },
]
[[package]]
@@ -2726,7 +2750,7 @@ wheels = [
[[package]]
name = "langchain-core"
version = "0.3.77"
version = "0.3.79"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "jsonpatch" },
@@ -2737,23 +2761,23 @@ dependencies = [
{ name = "tenacity" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/40/cc/786184e5f6a921a2aa4d2ac51d3adf0cd037289f3becff39644bee9654ee/langchain_core-0.3.77.tar.gz", hash = "sha256:1d6f2ad6bb98dd806c6c66a822fa93808d821e9f0348b28af0814b3a149830e7", size = 580255, upload-time = "2025-10-01T14:34:37.368Z" }
sdist = { url = "https://files.pythonhosted.org/packages/c8/99/f926495f467e0f43289f12e951655d267d1eddc1136c3cf4dd907794a9a7/langchain_core-0.3.79.tar.gz", hash = "sha256:024ba54a346dd9b13fb8b2342e0c83d0111e7f26fa01f545ada23ad772b55a60", size = 580895, upload-time = "2025-10-09T21:59:08.359Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/64/18/e7462ae0ce57caa9f6d5d975dca861e9a751e5ca253d60a809e0d833eac3/langchain_core-0.3.77-py3-none-any.whl", hash = "sha256:9966dfe3d8365847c5fb85f97dd20e3e21b1904ae87cfd9d362b7196fb516637", size = 449525, upload-time = "2025-10-01T14:34:35.672Z" },
{ url = "https://files.pythonhosted.org/packages/fc/71/46b0efaf3fc6ad2c2bd600aef500f1cb2b7038a4042f58905805630dd29d/langchain_core-0.3.79-py3-none-any.whl", hash = "sha256:92045bfda3e741f8018e1356f83be203ec601561c6a7becfefe85be5ddc58fdb", size = 449779, upload-time = "2025-10-09T21:59:06.493Z" },
]
[[package]]
name = "langchain-openai"
version = "0.3.23"
version = "0.3.29"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "langchain-core" },
{ name = "openai" },
{ name = "tiktoken" },
]
sdist = { url = "https://files.pythonhosted.org/packages/74/f1/575120e829430f9bdcfc2c5c4121f04b1b5a143d96e572ff32399b787ef2/langchain_openai-0.3.23.tar.gz", hash = "sha256:73411c06e04bc145db7146a6fcf33dd0f1a85130499dcae988829a4441ddaa66", size = 647923, upload-time = "2025-06-13T14:24:31.388Z" }
sdist = { url = "https://files.pythonhosted.org/packages/5b/56/2e2010d15118ac52760f92ebf6ce75b3508e7a1023107ea04233fd6263e0/langchain_openai-0.3.29.tar.gz", hash = "sha256:83a0455f8ce874aa1806131ca3b4db08e482be037b7457a9b3ca21a213d2ab47", size = 766499, upload-time = "2025-08-08T15:12:32.402Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/71/65/88060305d5d627841bc8da7e9fb31fb603e5b103b4e5ec5b4d1a7edfbc3b/langchain_openai-0.3.23-py3-none-any.whl", hash = "sha256:624794394482c0923823f0aac44979968d77fdcfa810e42d4b0abd8096199a40", size = 65392, upload-time = "2025-06-13T14:24:30.263Z" },
{ url = "https://files.pythonhosted.org/packages/ac/f2/a6a73beec15e90605e6a24c4498a8592d79a72c8e81c18ed0f5e9b7308e9/langchain_openai-0.3.29-py3-none-any.whl", hash = "sha256:71ae6791b3e017ec892a8062f993edc882c6665fd8385aa66e9dc3bff8205996", size = 74316, upload-time = "2025-08-08T15:12:30.794Z" },
]
[[package]]
@@ -3209,23 +3233,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/4b/b4/b61eeb92c424947675492dec3a411bdbeae307dfd78162d65ab47e8c3b4f/mlx-0.29.2-cp313-cp313-manylinux_2_35_x86_64.whl", hash = "sha256:c3b9a9aee13f346d060966472954eebe99d9f1b295c9a237c9a000f1ef9adf2c", size = 648709, upload-time = "2025-09-26T22:26:03.452Z" },
]
[[package]]
name = "mlx-lm"
version = "0.28.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "jinja2" },
{ name = "mlx" },
{ name = "numpy" },
{ name = "protobuf" },
{ name = "pyyaml" },
{ name = "transformers" },
]
sdist = { url = "https://files.pythonhosted.org/packages/1c/d7/fdde445c7bd443a2ed23badda6064f1477c4051543922106f365e94082cd/mlx_lm-0.28.2.tar.gz", hash = "sha256:d28752635ed5c89ff2b41361916c928e6b16f765c07b2908044e1dcaf921ed9b", size = 209374, upload-time = "2025-10-02T14:23:57.497Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f2/1c/89e0f60d45e364de8507065f73aeb8d2fd810d6cb95a9a512880b09399d5/mlx_lm-0.28.2-py3-none-any.whl", hash = "sha256:1501529e625d0d648216f7bb543b8b449d5fd17bd598f635536dbc1fbde6d1d6", size = 284600, upload-time = "2025-10-02T14:23:56.395Z" },
]
[[package]]
name = "mlx-metal"
version = "0.29.2"
@@ -3851,7 +3858,7 @@ wheels = [
[[package]]
name = "openai"
version = "1.74.0"
version = "1.97.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio" },
@@ -3863,9 +3870,9 @@ dependencies = [
{ name = "tqdm" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/75/86/c605a6e84da0248f2cebfcd864b5a6076ecf78849245af5e11d2a5ec7977/openai-1.74.0.tar.gz", hash = "sha256:592c25b8747a7cad33a841958f5eb859a785caea9ee22b9e4f4a2ec062236526", size = 427571, upload-time = "2025-04-14T16:45:25.062Z" }
sdist = { url = "https://files.pythonhosted.org/packages/a6/57/1c471f6b3efb879d26686d31582997615e969f3bb4458111c9705e56332e/openai-1.97.1.tar.gz", hash = "sha256:a744b27ae624e3d4135225da9b1c89c107a2a7e5bc4c93e5b7b5214772ce7a4e", size = 494267, upload-time = "2025-07-22T13:10:12.607Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a9/91/8c150f16a96367e14bd7d20e86e0bbbec3080e3eb593e63f21a7f013f8e4/openai-1.74.0-py3-none-any.whl", hash = "sha256:aff3e0f9fb209836382ec112778667027f4fd6ae38bdb2334bc9e173598b092a", size = 644790, upload-time = "2025-04-14T16:45:23.041Z" },
{ url = "https://files.pythonhosted.org/packages/ee/35/412a0e9c3f0d37c94ed764b8ac7adae2d834dbd20e69f6aca582118e0f55/openai-1.97.1-py3-none-any.whl", hash = "sha256:4e96bbdf672ec3d44968c9ea39d2c375891db1acc1794668d8149d5fa6000606", size = 764380, upload-time = "2025-07-22T13:10:10.689Z" },
]
[[package]]
@@ -3904,7 +3911,7 @@ wheels = [
[[package]]
name = "openpipe"
version = "4.50.0"
version = "5.0.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anthropic" },
@@ -3913,9 +3920,9 @@ dependencies = [
{ name = "openai" },
{ name = "python-dateutil" },
]
sdist = { url = "https://files.pythonhosted.org/packages/ec/0b/5ac4afd2253e058463fe46b44ebdf9cf153af343b457f13e9e592943c16d/openpipe-4.50.0.tar.gz", hash = "sha256:a2b1bf7a30a8d4c2cf45b85c749839ea9811e36f9d03916df8ffa343d9193a0e", size = 98954, upload-time = "2025-04-15T18:13:36.935Z" }
sdist = { url = "https://files.pythonhosted.org/packages/7c/34/b487bc0ff60d3ed634e6f7bc34b5138f04e6ae319cc6578001822df93901/openpipe-5.0.0.tar.gz", hash = "sha256:040acc526fece42ba505fcedd8cd584f42482c9bd01f16b2538c9ea9c82882f4", size = 98910, upload-time = "2025-07-31T01:36:29.482Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/92/39/04870a3157d4ad6e8b1671f584da3e064750ccd64aa08339c6fc6dbd3a1c/openpipe-4.50.0-py3-none-any.whl", hash = "sha256:2071c3edbba3e08ceb977ad8c12d407f4da86c0c3815447fa33674d918276e5e", size = 440892, upload-time = "2025-04-15T18:13:35.258Z" },
{ url = "https://files.pythonhosted.org/packages/7a/5e/516010c25a32884a87e1f8303a292f3981fa382cc7570a9ed88fb28681d5/openpipe-5.0.0-py3-none-any.whl", hash = "sha256:c04af7afb4d9bcd52e1250757dd93d0e0ed19c9ff4b524f131dd94aadf4c1a9b", size = 439951, upload-time = "2025-07-31T01:36:28.003Z" },
]
[[package]]
@@ -3931,6 +3938,67 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/91/48/28ed9e55dcf2f453128df738210a980e09f4e468a456fa3c763dbc8be70a/opentelemetry_api-1.37.0-py3-none-any.whl", hash = "sha256:accf2024d3e89faec14302213bc39550ec0f4095d1cf5ca688e1bfb1c8612f47", size = 65732, upload-time = "2025-09-11T10:28:41.826Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp"
version = "1.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-exporter-otlp-proto-grpc" },
{ name = "opentelemetry-exporter-otlp-proto-http" },
]
sdist = { url = "https://files.pythonhosted.org/packages/64/df/47fde1de15a3d5ad410e98710fac60cd3d509df5dc7ec1359b71d6bf7e70/opentelemetry_exporter_otlp-1.37.0.tar.gz", hash = "sha256:f85b1929dd0d750751cc9159376fb05aa88bb7a08b6cdbf84edb0054d93e9f26", size = 6145, upload-time = "2025-09-11T10:29:03.075Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f5/23/7e35e41111e3834d918e414eca41555d585e8860c9149507298bb3b9b061/opentelemetry_exporter_otlp-1.37.0-py3-none-any.whl", hash = "sha256:bd44592c6bc7fc3e5c0a9b60f2ee813c84c2800c449e59504ab93f356cc450fc", size = 7019, upload-time = "2025-09-11T10:28:44.094Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp-proto-common"
version = "1.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "opentelemetry-proto" },
]
sdist = { url = "https://files.pythonhosted.org/packages/dc/6c/10018cbcc1e6fff23aac67d7fd977c3d692dbe5f9ef9bb4db5c1268726cc/opentelemetry_exporter_otlp_proto_common-1.37.0.tar.gz", hash = "sha256:c87a1bdd9f41fdc408d9cc9367bb53f8d2602829659f2b90be9f9d79d0bfe62c", size = 20430, upload-time = "2025-09-11T10:29:03.605Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/08/13/b4ef09837409a777f3c0af2a5b4ba9b7af34872bc43609dda0c209e4060d/opentelemetry_exporter_otlp_proto_common-1.37.0-py3-none-any.whl", hash = "sha256:53038428449c559b0c564b8d718df3314da387109c4d36bd1b94c9a641b0292e", size = 18359, upload-time = "2025-09-11T10:28:44.939Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp-proto-grpc"
version = "1.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "googleapis-common-protos" },
{ name = "grpcio" },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-exporter-otlp-proto-common" },
{ name = "opentelemetry-proto" },
{ name = "opentelemetry-sdk" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/d1/11/4ad0979d0bb13ae5a845214e97c8d42da43980034c30d6f72d8e0ebe580e/opentelemetry_exporter_otlp_proto_grpc-1.37.0.tar.gz", hash = "sha256:f55bcb9fc848ce05ad3dd954058bc7b126624d22c4d9e958da24d8537763bec5", size = 24465, upload-time = "2025-09-11T10:29:04.172Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/39/17/46630b74751031a658706bef23ac99cdc2953cd3b2d28ec90590a0766b3e/opentelemetry_exporter_otlp_proto_grpc-1.37.0-py3-none-any.whl", hash = "sha256:aee5104835bf7993b7ddaaf380b6467472abaedb1f1dbfcc54a52a7d781a3890", size = 19305, upload-time = "2025-09-11T10:28:45.776Z" },
]
[[package]]
name = "opentelemetry-exporter-otlp-proto-http"
version = "1.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "googleapis-common-protos" },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-exporter-otlp-proto-common" },
{ name = "opentelemetry-proto" },
{ name = "opentelemetry-sdk" },
{ name = "requests" },
{ name = "typing-extensions" },
]
sdist = { url = "https://files.pythonhosted.org/packages/5d/e3/6e320aeb24f951449e73867e53c55542bebbaf24faeee7623ef677d66736/opentelemetry_exporter_otlp_proto_http-1.37.0.tar.gz", hash = "sha256:e52e8600f1720d6de298419a802108a8f5afa63c96809ff83becb03f874e44ac", size = 17281, upload-time = "2025-09-11T10:29:04.844Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e9/e9/70d74a664d83976556cec395d6bfedd9b85ec1498b778367d5f93e373397/opentelemetry_exporter_otlp_proto_http-1.37.0-py3-none-any.whl", hash = "sha256:54c42b39945a6cc9d9a2a33decb876eabb9547e0dcb49df090122773447f1aef", size = 19576, upload-time = "2025-09-11T10:28:46.726Z" },
]
[[package]]
name = "opentelemetry-instrumentation"
version = "0.58b0"
@@ -3960,6 +4028,18 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/a5/54/add1076cb37980e617723a96e29c84006983e8ad6fc589dde7f69ddc57d4/opentelemetry_instrumentation_threading-0.58b0-py3-none-any.whl", hash = "sha256:eacc072881006aceb5b9b6831bcdce718c67ef6f31ac0b32bd6a23a94d979b4a", size = 9312, upload-time = "2025-09-11T11:41:58.603Z" },
]
[[package]]
name = "opentelemetry-proto"
version = "1.37.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "protobuf" },
]
sdist = { url = "https://files.pythonhosted.org/packages/dd/ea/a75f36b463a36f3c5a10c0b5292c58b31dbdde74f6f905d3d0ab2313987b/opentelemetry_proto-1.37.0.tar.gz", hash = "sha256:30f5c494faf66f77faeaefa35ed4443c5edb3b0aa46dad073ed7210e1a789538", size = 46151, upload-time = "2025-09-11T10:29:11.04Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/c4/25/f89ea66c59bd7687e218361826c969443c4fa15dfe89733f3bf1e2a9e971/opentelemetry_proto-1.37.0-py3-none-any.whl", hash = "sha256:8ed8c066ae8828bbf0c39229979bdf583a126981142378a9cbe9d6fd5701c6e2", size = 72534, upload-time = "2025-09-11T10:28:56.831Z" },
]
[[package]]
name = "opentelemetry-sdk"
version = "1.37.0"
@@ -3987,6 +4067,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/07/90/68152b7465f50285d3ce2481b3aec2f82822e3f52e5152eeeaf516bab841/opentelemetry_semantic_conventions-0.58b0-py3-none-any.whl", hash = "sha256:5564905ab1458b96684db1340232729fce3b5375a06e140e8904c78e4f815b28", size = 207954, upload-time = "2025-09-11T10:28:59.218Z" },
]
[[package]]
name = "opentelemetry-semantic-conventions-ai"
version = "0.4.13"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ba/e6/40b59eda51ac47009fb47afcdf37c6938594a0bd7f3b9fadcbc6058248e3/opentelemetry_semantic_conventions_ai-0.4.13.tar.gz", hash = "sha256:94efa9fb4ffac18c45f54a3a338ffeb7eedb7e1bb4d147786e77202e159f0036", size = 5368, upload-time = "2025-08-22T10:14:17.387Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/35/b5/cf25da2218910f0d6cdf7f876a06bed118c4969eacaf60a887cbaef44f44/opentelemetry_semantic_conventions_ai-0.4.13-py3-none-any.whl", hash = "sha256:883a30a6bb5deaec0d646912b5f9f6dcbb9f6f72557b73d0f2560bf25d13e2d5", size = 6080, upload-time = "2025-08-22T10:14:16.477Z" },
]
[[package]]
name = "orjson"
version = "3.11.3"
@@ -4544,11 +4633,11 @@ requires-dist = [
{ name = "aiortc", marker = "extra == 'webrtc'", specifier = ">=1.13.0,<2" },
{ name = "anthropic", marker = "extra == 'anthropic'", specifier = "~=0.49.0" },
{ name = "audioop-lts", marker = "python_full_version >= '3.13'", specifier = "~=0.2.1" },
{ name = "aws-sdk-bedrock-runtime", marker = "python_full_version >= '3.12' and extra == 'aws-nova-sonic'", specifier = "~=0.1.0" },
{ name = "aws-sdk-bedrock-runtime", marker = "python_full_version >= '3.12' and extra == 'aws-nova-sonic'", specifier = "~=0.1.1" },
{ name = "azure-cognitiveservices-speech", marker = "extra == 'azure'", specifier = "~=1.42.0" },
{ name = "cartesia", marker = "extra == 'cartesia'", specifier = "~=2.0.3" },
{ name = "coremltools", marker = "extra == 'local-smart-turn'", specifier = ">=8.0" },
{ name = "daily-python", marker = "extra == 'daily'", specifier = "~=0.19.9" },
{ name = "daily-python", marker = "extra == 'daily'", specifier = "~=0.21.0" },
{ name = "deepgram-sdk", marker = "extra == 'deepgram'", specifier = "~=4.7.0" },
{ name = "docstring-parser", specifier = "~=0.16" },
{ name = "einops", marker = "extra == 'moondream'", specifier = "~=0.8.0" },
@@ -4579,9 +4668,9 @@ requires-dist = [
{ name = "nvidia-riva-client", marker = "extra == 'riva'", specifier = "~=2.21.1" },
{ name = "onnxruntime", marker = "extra == 'local-smart-turn-v3'", specifier = ">=1.20.1,<2" },
{ name = "onnxruntime", marker = "extra == 'silero'", specifier = ">=1.20.1,<2" },
{ name = "openai", specifier = ">=1.74.0,<=1.99.1" },
{ name = "openai", specifier = ">=1.74.0,<3" },
{ name = "opencv-python", marker = "extra == 'webrtc'", specifier = ">=4.11.0.86,<5" },
{ name = "openpipe", marker = "extra == 'openpipe'", specifier = "~=4.50.0" },
{ name = "openpipe", marker = "extra == 'openpipe'", specifier = ">=4.50.0,<6" },
{ name = "opentelemetry-api", marker = "extra == 'tracing'", specifier = ">=1.33.0" },
{ name = "opentelemetry-instrumentation", marker = "extra == 'tracing'", specifier = ">=0.54b0" },
{ name = "opentelemetry-sdk", marker = "extra == 'tracing'", specifier = ">=1.33.0" },
@@ -4619,7 +4708,7 @@ requires-dist = [
{ name = "simli-ai", marker = "extra == 'simli'", specifier = "~=0.1.10" },
{ name = "soundfile", marker = "extra == 'soundfile'", specifier = "~=0.13.0" },
{ name = "soxr", specifier = "~=0.5.0" },
{ name = "speechmatics-rt", marker = "extra == 'speechmatics'", specifier = ">=0.4.0" },
{ name = "speechmatics-rt", marker = "extra == 'speechmatics'", specifier = ">=0.5.0" },
{ name = "strands-agents", marker = "extra == 'strands'", specifier = ">=1.9.1,<2" },
{ name = "tenacity", marker = "extra == 'livekit'", specifier = ">=8.2.3,<10.0.0" },
{ name = "timm", marker = "extra == 'moondream'", specifier = "~=1.0.13" },
@@ -4961,172 +5050,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/a5/8b/7f9a061c1cc2b230f9ac02a6003fcd14c85ce1828013aecbaf45aa988d20/PyAudio-0.2.14-cp313-cp313-win_amd64.whl", hash = "sha256:692d8c1446f52ed2662120bcd9ddcb5aa2b71f38bda31e58b19fb4672fffba69", size = 173655, upload-time = "2024-11-20T19:12:13.616Z" },
]
[[package]]
name = "pybase64"
version = "1.4.2"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/04/14/43297a7b7f0c1bf0c00b596f754ee3ac946128c64d21047ccf9c9bbc5165/pybase64-1.4.2.tar.gz", hash = "sha256:46cdefd283ed9643315d952fe44de80dc9b9a811ce6e3ec97fd1827af97692d0", size = 137246, upload-time = "2025-07-27T13:08:57.808Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f3/6d/0a7159c24ed35c8b9190b148376ad9b96598354f94ede29df74861da9ec6/pybase64-1.4.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:82b4593b480773b17698fef33c68bae0e1c474ba07663fad74249370c46b46c9", size = 38240, upload-time = "2025-07-27T13:02:17.876Z" },
{ url = "https://files.pythonhosted.org/packages/86/2e/dad4cd832a90a49d98867e824180585e7c928504987d37304bccae11a314/pybase64-1.4.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:a126f29d29cb4a498db179135dbf955442a0de5b00f374523f5dcceb9074ff58", size = 31658, upload-time = "2025-07-27T13:02:20.823Z" },
{ url = "https://files.pythonhosted.org/packages/1d/d8/30ea35dc2c8c568be93e1379efcaa35092e37efa2ce7f1985ccc63babee7/pybase64-1.4.2-cp310-cp310-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:1eef93c29cc5567480d168f9cc1ebd3fc3107c65787aed2019a8ea68575a33e0", size = 65963, upload-time = "2025-07-27T13:02:22.376Z" },
{ url = "https://files.pythonhosted.org/packages/f6/da/1c22f2a21d6bb9ec2a214d15ae02d5b20a95335de218a0ecbf769c535a5c/pybase64-1.4.2-cp310-cp310-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:17b871a34aaeb0644145cb6bf28feb163f593abea11aec3dbcc34a006edfc828", size = 68887, upload-time = "2025-07-27T13:02:23.606Z" },
{ url = "https://files.pythonhosted.org/packages/ac/8d/e04d489ba99b444ce94b4d5b232365d00b0f0e8564275d7ba7434dcabe72/pybase64-1.4.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:1f734e16293637a35d282ce594eb05a7a90ea3ae2bc84a3496a5df9e6b890725", size = 57503, upload-time = "2025-07-27T13:02:24.83Z" },
{ url = "https://files.pythonhosted.org/packages/7e/b8/5ec9c334f30cf898709a084d596bf4b47aec2e07870f07bac5cf39754eca/pybase64-1.4.2-cp310-cp310-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:22bd38db2d990d5545dde83511edeec366630d00679dbd945472315c09041dc6", size = 54517, upload-time = "2025-07-27T13:02:26.006Z" },
{ url = "https://files.pythonhosted.org/packages/b9/5a/6e4424ecca041e53aa7c14525f99edd43d0117c23c5d9cb14e931458a536/pybase64-1.4.2-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:dc65cee686dda72007b7541b2014f33ee282459c781b9b61305bd8b9cfadc8e1", size = 57167, upload-time = "2025-07-27T13:02:27.47Z" },
{ url = "https://files.pythonhosted.org/packages/5f/d0/13f1a9467cf565eecc21dce89fb0723458d8c563d2ccfb99b96e8318dfd5/pybase64-1.4.2-cp310-cp310-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:1e79641c420a22e49c67c046895efad05bf5f8b1dbe0dd78b4af3ab3f2923fe2", size = 57718, upload-time = "2025-07-27T13:02:28.631Z" },
{ url = "https://files.pythonhosted.org/packages/3e/34/d80335c36ad9400b18b4f92e9f680cf7646102fe4919f7bce5786a2ccb7b/pybase64-1.4.2-cp310-cp310-manylinux_2_31_riscv64.whl", hash = "sha256:12f5e7db522ef780a8b333dab5f7d750d270b23a1684bc2235ba50756c7ba428", size = 53021, upload-time = "2025-07-27T13:02:29.823Z" },
{ url = "https://files.pythonhosted.org/packages/68/57/504ff75f7c78df28be126fe6634083d28d7f84c17e04a74a7dcb50ab2377/pybase64-1.4.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:a618b1e1a63e75dd40c2a397d875935ed0835464dc55cb1b91e8f880113d0444", size = 56306, upload-time = "2025-07-27T13:02:31.314Z" },
{ url = "https://files.pythonhosted.org/packages/bf/bc/2d21cda8b73c8c9f5cd3d7e6e26dd6dfc96491052112f282332a3d5bf1d9/pybase64-1.4.2-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:89b0a51702c7746fa914e75e680ad697b979cdead6b418603f56a6fc9de2f50f", size = 50101, upload-time = "2025-07-27T13:02:32.662Z" },
{ url = "https://files.pythonhosted.org/packages/88/6d/51942e7737bb0711ca3e55db53924fd7f07166d79da5508ab8f5fd5972a8/pybase64-1.4.2-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:c5161b8b82f8ba5dbbc3f76e0270622a2c2fdb9ffaf092d8f774ad7ec468c027", size = 66555, upload-time = "2025-07-27T13:02:34.122Z" },
{ url = "https://files.pythonhosted.org/packages/b6/c8/c46024d196402e7be4d3fad85336863a34816c3436c51fcf9c7c0781bf11/pybase64-1.4.2-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:2168de920c9b1e57850e9ff681852923a953601f73cc96a0742a42236695c316", size = 55684, upload-time = "2025-07-27T13:02:35.427Z" },
{ url = "https://files.pythonhosted.org/packages/6a/c5/953782c9d599ff5217ee87f19e317c494cd4840afcab4c48f99cb78ca201/pybase64-1.4.2-cp310-cp310-musllinux_1_2_riscv64.whl", hash = "sha256:7a1e3dc977562abe40ab43483223013be71b215a5d5f3c78a666e70a5076eeec", size = 52475, upload-time = "2025-07-27T13:02:36.634Z" },
{ url = "https://files.pythonhosted.org/packages/05/fb/57d36173631aab67ca4558cdbde1047fc67a09b77f9c53addd57c7e9fdd4/pybase64-1.4.2-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:4cf1e8a57449e48137ef4de00a005e24c3f1cffc0aafc488e36ceb5bb2cbb1da", size = 53943, upload-time = "2025-07-27T13:02:37.777Z" },
{ url = "https://files.pythonhosted.org/packages/75/73/23e5bb0bffac0cabe2d11d1c618f6ef73da9f430da03c5249931e3c49b63/pybase64-1.4.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:d8e1a381ba124f26a93d5925efbf6e6c36287fc2c93d74958e8b677c30a53fc0", size = 68411, upload-time = "2025-07-27T13:02:39.302Z" },
{ url = "https://files.pythonhosted.org/packages/ce/e7/0d5c99e5e61ff5e46949a0128b49fc2c47afc0d2b815333459b17aa9d467/pybase64-1.4.2-cp310-cp310-win32.whl", hash = "sha256:8fdd9c5b60ec9a1db854f5f96bba46b80a9520069282dc1d37ff433eb8248b1f", size = 33614, upload-time = "2025-07-27T13:02:40.478Z" },
{ url = "https://files.pythonhosted.org/packages/23/40/879b6de61d7c07a2cbf76b75e9739c4938c3a1f66ac03243f2ff7ec9fb6b/pybase64-1.4.2-cp310-cp310-win_amd64.whl", hash = "sha256:37a6c73f14c6539c0ad1aebf0cce92138af25c99a6e7aee637d9f9fc634c8a40", size = 35790, upload-time = "2025-07-27T13:02:41.864Z" },
{ url = "https://files.pythonhosted.org/packages/d2/e2/75cec12880ce3f47a79a2b9a0cdc766dc0429a7ce967bb3ab3a4b55a7f6b/pybase64-1.4.2-cp310-cp310-win_arm64.whl", hash = "sha256:b3280d03b7b361622c469d005cc270d763d9e29d0a490c26addb4f82dfe71a79", size = 30900, upload-time = "2025-07-27T13:02:43.022Z" },
{ url = "https://files.pythonhosted.org/packages/da/fb/edaa56bbf04715efc3c36966cc0150e01d7a8336c3da182f850b7fd43d32/pybase64-1.4.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:26284ef64f142067293347bcc9d501d2b5d44b92eab9d941cb10a085fb01c666", size = 38238, upload-time = "2025-07-27T13:02:44.224Z" },
{ url = "https://files.pythonhosted.org/packages/28/a4/ca1538e9adf08f5016b3543b0060c18aea9a6e805dd20712a197c509d90d/pybase64-1.4.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:52dd32fe5cbfd8af8f3f034a4a65ee61948c72e5c358bf69d59543fc0dbcf950", size = 31659, upload-time = "2025-07-27T13:02:45.445Z" },
{ url = "https://files.pythonhosted.org/packages/0b/8f/f9b49926a60848ba98350dd648227ec524fb78340b47a450c4dbaf24b1bb/pybase64-1.4.2-cp311-cp311-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:37f133e8c96427995480bb6d396d9d49e949a3e829591845bb6a5a7f215ca177", size = 68318, upload-time = "2025-07-27T13:02:46.644Z" },
{ url = "https://files.pythonhosted.org/packages/29/9b/6ed2dd2bc8007f33b8316d6366b0901acbdd5665b419c2893b3dd48708de/pybase64-1.4.2-cp311-cp311-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:a6ee3874b0abbdd4c903d3989682a3f016fd84188622879f6f95a5dc5718d7e5", size = 71357, upload-time = "2025-07-27T13:02:47.937Z" },
{ url = "https://files.pythonhosted.org/packages/fb/69/be9ac8127da8d8339db7129683bd2975cecb0bf40a82731e1a492577a177/pybase64-1.4.2-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5c69f177b1e404b22b05802127d6979acf4cb57f953c7de9472410f9c3fdece7", size = 59817, upload-time = "2025-07-27T13:02:49.163Z" },
{ url = "https://files.pythonhosted.org/packages/f4/a2/e3e09e000b509609276ee28b71beb0b61462d4a43b3e0db0a44c8652880c/pybase64-1.4.2-cp311-cp311-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:80c817e88ef2ca3cc9a285fde267690a1cb821ce0da4848c921c16f0fec56fda", size = 56639, upload-time = "2025-07-27T13:02:50.384Z" },
{ url = "https://files.pythonhosted.org/packages/01/70/ad7eff88aa4f1be06db705812e1f01749606933bf8fe9df553bb04b703e6/pybase64-1.4.2-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:7a4bb6e7e45bfdaea0f2aaf022fc9a013abe6e46ccea31914a77e10f44098688", size = 59368, upload-time = "2025-07-27T13:02:51.883Z" },
{ url = "https://files.pythonhosted.org/packages/9d/82/0cd1b4bcd2a4da7805cfa04587be783bf9583b34ac16cadc29cf119a4fa2/pybase64-1.4.2-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:2710a80d41a2b41293cb0e5b84b5464f54aa3f28f7c43de88784d2d9702b8a1c", size = 59981, upload-time = "2025-07-27T13:02:53.16Z" },
{ url = "https://files.pythonhosted.org/packages/3c/4c/8029a03468307dfaf0f9694d31830487ee43af5f8a73407004907724e8ac/pybase64-1.4.2-cp311-cp311-manylinux_2_31_riscv64.whl", hash = "sha256:aa6122c8a81f6597e1c1116511f03ed42cf377c2100fe7debaae7ca62521095a", size = 54908, upload-time = "2025-07-27T13:02:54.363Z" },
{ url = "https://files.pythonhosted.org/packages/a1/8b/70bd0fe659e242efd0f60895a8ce1fe88e3a4084fd1be368974c561138c9/pybase64-1.4.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:b7e22b02505d64db308e9feeb6cb52f1d554ede5983de0befa59ac2d2ffb6a5f", size = 58650, upload-time = "2025-07-27T13:02:55.905Z" },
{ url = "https://files.pythonhosted.org/packages/64/ca/9c1d23cbc4b9beac43386a32ad53903c816063cef3f14c10d7c3d6d49a23/pybase64-1.4.2-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:edfe4a3c8c4007f09591f49b46a89d287ef5e8cd6630339536fe98ff077263c2", size = 52323, upload-time = "2025-07-27T13:02:57.192Z" },
{ url = "https://files.pythonhosted.org/packages/aa/29/a6292e9047248c8616dc53131a49da6c97a61616f80e1e36c73d7ef895fe/pybase64-1.4.2-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:b79b4a53dd117ffbd03e96953f2e6bd2827bfe11afeb717ea16d9b0893603077", size = 68979, upload-time = "2025-07-27T13:02:58.594Z" },
{ url = "https://files.pythonhosted.org/packages/c2/e0/cfec7b948e170395d8e88066e01f50e71195db9837151db10c14965d6222/pybase64-1.4.2-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:fd9afa7a61d89d170607faf22287290045757e782089f0357b8f801d228d52c3", size = 58037, upload-time = "2025-07-27T13:02:59.753Z" },
{ url = "https://files.pythonhosted.org/packages/74/7e/0ac1850198c9c35ef631174009cee576f4d8afff3bf493ce310582976ab4/pybase64-1.4.2-cp311-cp311-musllinux_1_2_riscv64.whl", hash = "sha256:5c17b092e4da677a595178d2db17a5d2fafe5c8e418d46c0c4e4cde5adb8cff3", size = 54416, upload-time = "2025-07-27T13:03:00.978Z" },
{ url = "https://files.pythonhosted.org/packages/1b/45/b0b037f27e86c50e62d927f0bc1bde8b798dd55ab39197b116702e508d05/pybase64-1.4.2-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:120799274cf55f3f5bb8489eaa85142f26170564baafa7cf3e85541c46b6ab13", size = 56257, upload-time = "2025-07-27T13:03:02.201Z" },
{ url = "https://files.pythonhosted.org/packages/d2/0d/5034598aac56336d88fd5aaf6f34630330643b51d399336b8c788d798fc5/pybase64-1.4.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:522e4e712686acec2d25de9759dda0b0618cb9f6588523528bc74715c0245c7b", size = 70889, upload-time = "2025-07-27T13:03:03.437Z" },
{ url = "https://files.pythonhosted.org/packages/8a/3b/0645f21bb08ecf45635b624958b5f9e569069d31ecbf125dc7e0e5b83f60/pybase64-1.4.2-cp311-cp311-win32.whl", hash = "sha256:bfd828792982db8d787515535948c1e340f1819407c8832f94384c0ebeaf9d74", size = 33631, upload-time = "2025-07-27T13:03:05.194Z" },
{ url = "https://files.pythonhosted.org/packages/8f/08/24f8103c1f19e78761026cdd9f3b3be73239bc19cf5ab6fef0e8042d0bc6/pybase64-1.4.2-cp311-cp311-win_amd64.whl", hash = "sha256:7a9e89d40dbf833af481d1d5f1a44d173c9c4b56a7c8dba98e39a78ee87cfc52", size = 35781, upload-time = "2025-07-27T13:03:06.779Z" },
{ url = "https://files.pythonhosted.org/packages/66/cd/832fb035a0ea7eb53d776a5cfa961849e22828f6dfdfcdb9eb43ba3c0166/pybase64-1.4.2-cp311-cp311-win_arm64.whl", hash = "sha256:ce5809fa90619b03eab1cd63fec142e6cf1d361731a9b9feacf27df76c833343", size = 30903, upload-time = "2025-07-27T13:03:07.903Z" },
{ url = "https://files.pythonhosted.org/packages/28/6d/11ede991e800797b9f5ebd528013b34eee5652df93de61ffb24503393fa5/pybase64-1.4.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:db2c75d1388855b5a1015b65096d7dbcc708e7de3245dcbedeb872ec05a09326", size = 38326, upload-time = "2025-07-27T13:03:09.065Z" },
{ url = "https://files.pythonhosted.org/packages/fe/84/87f1f565f42e2397e2aaa2477c86419f5173c3699881c42325c090982f0a/pybase64-1.4.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:6b621a972a01841368fdb9dedc55fd3c6e0c7217d0505ba3b1ebe95e7ef1b493", size = 31661, upload-time = "2025-07-27T13:03:10.295Z" },
{ url = "https://files.pythonhosted.org/packages/cb/2a/a24c810e7a61d2cc6f73fe9ee4872a03030887fa8654150901b15f376f65/pybase64-1.4.2-cp312-cp312-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:f48c32ac6a16cbf57a5a96a073fef6ff7e3526f623cd49faa112b7f9980bafba", size = 68192, upload-time = "2025-07-27T13:03:11.467Z" },
{ url = "https://files.pythonhosted.org/packages/ee/87/d9baf98cbfc37b8657290ad4421f3a3c36aa0eafe4872c5859cfb52f3448/pybase64-1.4.2-cp312-cp312-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:ace8b23093a6bb862477080d9059b784096ab2f97541e8bfc40d42f062875149", size = 71587, upload-time = "2025-07-27T13:03:12.719Z" },
{ url = "https://files.pythonhosted.org/packages/0b/89/3df043cc56ef3b91b7aa0c26ae822a2d7ec8da0b0fd7c309c879b0eb5988/pybase64-1.4.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:1772c7532a7fb6301baea3dd3e010148dbf70cd1136a83c2f5f91bdc94822145", size = 59910, upload-time = "2025-07-27T13:03:14.266Z" },
{ url = "https://files.pythonhosted.org/packages/75/4f/6641e9edf37aeb4d4524dc7ba2168eff8d96c90e77f6283c2be3400ab380/pybase64-1.4.2-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:f86f7faddcba5cbfea475f8ab96567834c28bf09ca6c7c3d66ee445adac80d8f", size = 56701, upload-time = "2025-07-27T13:03:15.6Z" },
{ url = "https://files.pythonhosted.org/packages/2d/7f/20d8ac1046f12420a0954a45a13033e75f98aade36eecd00c64e3549b071/pybase64-1.4.2-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:0b8c8e275b5294089f314814b4a50174ab90af79d6a4850f6ae11261ff6a7372", size = 59288, upload-time = "2025-07-27T13:03:16.823Z" },
{ url = "https://files.pythonhosted.org/packages/17/ea/9c0ca570e3e50b3c6c3442e280c83b321a0464c86a9db1f982a4ff531550/pybase64-1.4.2-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:864d85a0470c615807ae8b97d724d068b940a2d10ac13a5f1b9e75a3ce441758", size = 60267, upload-time = "2025-07-27T13:03:18.132Z" },
{ url = "https://files.pythonhosted.org/packages/f9/ac/46894929d71ccedebbfb0284173b0fea96bc029cd262654ba8451a7035d6/pybase64-1.4.2-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:47254d97ed2d8351e30ecfdb9e2414547f66ba73f8a09f932c9378ff75cd10c5", size = 54801, upload-time = "2025-07-27T13:03:19.669Z" },
{ url = "https://files.pythonhosted.org/packages/6a/1e/02c95218ea964f0b2469717c2c69b48e63f4ca9f18af01a5b2a29e4c1216/pybase64-1.4.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:264b65ecc4f0ee73f3298ab83bbd8008f7f9578361b8df5b448f985d8c63e02a", size = 58599, upload-time = "2025-07-27T13:03:20.951Z" },
{ url = "https://files.pythonhosted.org/packages/15/45/ccc21004930789b8fb439d43e3212a6c260ccddb2bf450c39a20db093f33/pybase64-1.4.2-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:fbcc2b30cd740c16c9699f596f22c7a9e643591311ae72b1e776f2d539e9dd9d", size = 52388, upload-time = "2025-07-27T13:03:23.064Z" },
{ url = "https://files.pythonhosted.org/packages/c4/45/22e46e549710c4c237d77785b6fb1bc4c44c288a5c44237ba9daf5c34b82/pybase64-1.4.2-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:cda9f79c22d51ee4508f5a43b673565f1d26af4330c99f114e37e3186fdd3607", size = 68802, upload-time = "2025-07-27T13:03:24.673Z" },
{ url = "https://files.pythonhosted.org/packages/55/0c/232c6261b81296e5593549b36e6e7884a5da008776d12665923446322c36/pybase64-1.4.2-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:0c91c6d2a7232e2a1cd10b3b75a8bb657defacd4295a1e5e80455df2dfc84d4f", size = 57841, upload-time = "2025-07-27T13:03:25.948Z" },
{ url = "https://files.pythonhosted.org/packages/20/8a/b35a615ae6f04550d696bb179c414538b3b477999435fdd4ad75b76139e4/pybase64-1.4.2-cp312-cp312-musllinux_1_2_riscv64.whl", hash = "sha256:a370dea7b1cee2a36a4d5445d4e09cc243816c5bc8def61f602db5a6f5438e52", size = 54320, upload-time = "2025-07-27T13:03:27.495Z" },
{ url = "https://files.pythonhosted.org/packages/d3/a9/8bd4f9bcc53689f1b457ecefed1eaa080e4949d65a62c31a38b7253d5226/pybase64-1.4.2-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:9aa4de83f02e462a6f4e066811c71d6af31b52d7484de635582d0e3ec3d6cc3e", size = 56482, upload-time = "2025-07-27T13:03:28.942Z" },
{ url = "https://files.pythonhosted.org/packages/75/e5/4a7735b54a1191f61c3f5c2952212c85c2d6b06eb5fb3671c7603395f70c/pybase64-1.4.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:83a1c2f9ed00fee8f064d548c8654a480741131f280e5750bb32475b7ec8ee38", size = 70959, upload-time = "2025-07-27T13:03:30.171Z" },
{ url = "https://files.pythonhosted.org/packages/d3/67/e2b6cb32c782e12304d467418e70da0212567f42bd4d3b5eb1fdf64920ad/pybase64-1.4.2-cp312-cp312-win32.whl", hash = "sha256:a6e5688b18d558e8c6b8701cc8560836c4bbeba61d33c836b4dba56b19423716", size = 33683, upload-time = "2025-07-27T13:03:31.775Z" },
{ url = "https://files.pythonhosted.org/packages/4f/bc/d5c277496063a09707486180f17abbdbdebbf2f5c4441b20b11d3cb7dc7c/pybase64-1.4.2-cp312-cp312-win_amd64.whl", hash = "sha256:c995d21b8bd08aa179cd7dd4db0695c185486ecc72da1e8f6c37ec86cadb8182", size = 35817, upload-time = "2025-07-27T13:03:32.99Z" },
{ url = "https://files.pythonhosted.org/packages/e6/69/e4be18ae685acff0ae77f75d4586590f29d2cd187bf603290cf1d635cad4/pybase64-1.4.2-cp312-cp312-win_arm64.whl", hash = "sha256:e254b9258c40509c2ea063a7784f6994988f3f26099d6e08704e3c15dfed9a55", size = 30900, upload-time = "2025-07-27T13:03:34.499Z" },
{ url = "https://files.pythonhosted.org/packages/f4/56/5337f27a8b8d2d6693f46f7b36bae47895e5820bfa259b0072574a4e1057/pybase64-1.4.2-cp313-cp313-android_21_arm64_v8a.whl", hash = "sha256:0f331aa59549de21f690b6ccc79360ffed1155c3cfbc852eb5c097c0b8565a2b", size = 33888, upload-time = "2025-07-27T13:03:35.698Z" },
{ url = "https://files.pythonhosted.org/packages/4c/09/f3f4b11fc9beda7e8625e29fb0f549958fcbb34fea3914e1c1d95116e344/pybase64-1.4.2-cp313-cp313-android_21_x86_64.whl", hash = "sha256:9dad20bf1f3ed9e6fe566c4c9d07d9a6c04f5a280daebd2082ffb8620b0a880d", size = 40796, upload-time = "2025-07-27T13:03:36.927Z" },
{ url = "https://files.pythonhosted.org/packages/e3/ff/470768f0fe6de0aa302a8cb1bdf2f9f5cffc3f69e60466153be68bc953aa/pybase64-1.4.2-cp313-cp313-ios_13_0_arm64_iphoneos.whl", hash = "sha256:69d3f0445b0faeef7bb7f93bf8c18d850785e2a77f12835f49e524cc54af04e7", size = 30914, upload-time = "2025-07-27T13:03:38.475Z" },
{ url = "https://files.pythonhosted.org/packages/75/6b/d328736662665e0892409dc410353ebef175b1be5eb6bab1dad579efa6df/pybase64-1.4.2-cp313-cp313-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:2372b257b1f4dd512f317fb27e77d313afd137334de64c87de8374027aacd88a", size = 31380, upload-time = "2025-07-27T13:03:39.7Z" },
{ url = "https://files.pythonhosted.org/packages/ca/96/7ff718f87c67f4147c181b73d0928897cefa17dc75d7abc6e37730d5908f/pybase64-1.4.2-cp313-cp313-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:fb794502b4b1ec91c4ca5d283ae71aef65e3de7721057bd9e2b3ec79f7a62d7d", size = 38230, upload-time = "2025-07-27T13:03:41.637Z" },
{ url = "https://files.pythonhosted.org/packages/4d/58/a3307b048d799ff596a3c7c574fcba66f9b6b8c899a3c00a698124ca7ad5/pybase64-1.4.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:d5c532b03fd14a5040d6cf6571299a05616f925369c72ddf6fe2fb643eb36fed", size = 38319, upload-time = "2025-07-27T13:03:42.847Z" },
{ url = "https://files.pythonhosted.org/packages/08/a7/0bda06341b0a2c830d348c6e1c4d348caaae86c53dc9a046e943467a05e9/pybase64-1.4.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:0f699514dc1d5689ca9cf378139e0214051922732f9adec9404bc680a8bef7c0", size = 31655, upload-time = "2025-07-27T13:03:44.426Z" },
{ url = "https://files.pythonhosted.org/packages/87/df/e1d6e8479e0c5113c2c63c7b44886935ce839c2d99884c7304ca9e86547c/pybase64-1.4.2-cp313-cp313-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:cd3e8713cbd32c8c6aa935feaf15c7670e2b7e8bfe51c24dc556811ebd293a29", size = 68232, upload-time = "2025-07-27T13:03:45.729Z" },
{ url = "https://files.pythonhosted.org/packages/71/ab/db4dbdfccb9ca874d6ce34a0784761471885d96730de85cee3d300381529/pybase64-1.4.2-cp313-cp313-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:d377d48acf53abf4b926c2a7a24a19deb092f366a04ffd856bf4b3aa330b025d", size = 71608, upload-time = "2025-07-27T13:03:47.01Z" },
{ url = "https://files.pythonhosted.org/packages/11/e9/508df958563951045d728bbfbd3be77465f9231cf805cb7ccaf6951fc9f1/pybase64-1.4.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d83c076e78d619b9e1dd674e2bf5fb9001aeb3e0b494b80a6c8f6d4120e38cd9", size = 59912, upload-time = "2025-07-27T13:03:48.277Z" },
{ url = "https://files.pythonhosted.org/packages/f2/58/7f2cef1ceccc682088958448d56727369de83fa6b29148478f4d2acd107a/pybase64-1.4.2-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:ab9cdb6a8176a5cb967f53e6ad60e40c83caaa1ae31c5e1b29e5c8f507f17538", size = 56413, upload-time = "2025-07-27T13:03:49.908Z" },
{ url = "https://files.pythonhosted.org/packages/08/7c/7e0af5c5728fa7e2eb082d88eca7c6bd17429be819d58518e74919d42e66/pybase64-1.4.2-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:adf0c103ad559dbfb9fe69edfd26a15c65d9c991a5ab0a25b04770f9eb0b9484", size = 59311, upload-time = "2025-07-27T13:03:51.238Z" },
{ url = "https://files.pythonhosted.org/packages/03/8b/09825d0f37e45b9a3f546e5f990b6cf2dd838e54ea74122c2464646e0c77/pybase64-1.4.2-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:0d03ef2f253d97ce0685d3624bf5e552d716b86cacb8a6c971333ba4b827e1fc", size = 60282, upload-time = "2025-07-27T13:03:52.56Z" },
{ url = "https://files.pythonhosted.org/packages/9c/3f/3711d2413f969bfd5b9cc19bc6b24abae361b7673ff37bcb90c43e199316/pybase64-1.4.2-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:e565abf906efee76ae4be1aef5df4aed0fda1639bc0d7732a3dafef76cb6fc35", size = 54845, upload-time = "2025-07-27T13:03:54.167Z" },
{ url = "https://files.pythonhosted.org/packages/c6/3c/4c7ce1ae4d828c2bb56d144322f81bffbaaac8597d35407c3d7cbb0ff98f/pybase64-1.4.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:e3c6a5f15fd03f232fc6f295cce3684f7bb08da6c6d5b12cc771f81c9f125cc6", size = 58615, upload-time = "2025-07-27T13:03:55.494Z" },
{ url = "https://files.pythonhosted.org/packages/f5/8f/c2fc03bf4ed038358620065c75968a30184d5d3512d09d3ef9cc3bd48592/pybase64-1.4.2-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:bad9e3db16f448728138737bbd1af9dc2398efd593a8bdd73748cc02cd33f9c6", size = 52434, upload-time = "2025-07-27T13:03:56.808Z" },
{ url = "https://files.pythonhosted.org/packages/e2/0a/757d6df0a60327c893cfae903e15419914dd792092dc8cc5c9523d40bc9b/pybase64-1.4.2-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:2683ef271328365c31afee0ed8fa29356fb8fb7c10606794656aa9ffb95e92be", size = 68824, upload-time = "2025-07-27T13:03:58.735Z" },
{ url = "https://files.pythonhosted.org/packages/a0/14/84abe2ed8c29014239be1cfab45dfebe5a5ca779b177b8b6f779bd8b69da/pybase64-1.4.2-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:265b20089cd470079114c09bb74b101b3bfc3c94ad6b4231706cf9eff877d570", size = 57898, upload-time = "2025-07-27T13:04:00.379Z" },
{ url = "https://files.pythonhosted.org/packages/7e/c6/d193031f90c864f7b59fa6d1d1b5af41f0f5db35439988a8b9f2d1b32a13/pybase64-1.4.2-cp313-cp313-musllinux_1_2_riscv64.whl", hash = "sha256:e53173badead10ef8b839aa5506eecf0067c7b75ad16d9bf39bc7144631f8e67", size = 54319, upload-time = "2025-07-27T13:04:01.742Z" },
{ url = "https://files.pythonhosted.org/packages/cb/37/ec0c7a610ff8f994ee6e0c5d5d66b6b6310388b96ebb347b03ae39870fdf/pybase64-1.4.2-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:5823b8dcf74da7da0f761ed60c961e8928a6524e520411ad05fe7f9f47d55b40", size = 56472, upload-time = "2025-07-27T13:04:03.089Z" },
{ url = "https://files.pythonhosted.org/packages/c4/5a/e585b74f85cedd261d271e4c2ef333c5cfce7e80750771808f56fee66b98/pybase64-1.4.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1237f66c54357d325390da60aa5e21c6918fbcd1bf527acb9c1f4188c62cb7d5", size = 70966, upload-time = "2025-07-27T13:04:04.361Z" },
{ url = "https://files.pythonhosted.org/packages/ad/20/1b2fdd98b4ba36008419668c813025758214c543e362c66c49214ecd1127/pybase64-1.4.2-cp313-cp313-win32.whl", hash = "sha256:b0b851eb4f801d16040047f6889cca5e9dfa102b3e33f68934d12511245cef86", size = 33681, upload-time = "2025-07-27T13:04:06.126Z" },
{ url = "https://files.pythonhosted.org/packages/ff/64/3df4067d169c047054889f34b5a946cbe3785bca43404b93c962a5461a41/pybase64-1.4.2-cp313-cp313-win_amd64.whl", hash = "sha256:19541c6e26d17d9522c02680fe242206ae05df659c82a657aabadf209cd4c6c7", size = 35822, upload-time = "2025-07-27T13:04:07.752Z" },
{ url = "https://files.pythonhosted.org/packages/d1/fd/db505188adf812e60ee923f196f9deddd8a1895b2b29b37f5db94afc3b1c/pybase64-1.4.2-cp313-cp313-win_arm64.whl", hash = "sha256:77a191863d576c0a5dd81f8a568a5ca15597cc980ae809dce62c717c8d42d8aa", size = 30899, upload-time = "2025-07-27T13:04:09.062Z" },
{ url = "https://files.pythonhosted.org/packages/d9/27/5f5fecd206ec1e06e1608a380af18dcb76a6ab08ade6597a3251502dcdb2/pybase64-1.4.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:2e194bbabe3fdf9e47ba9f3e157394efe0849eb226df76432126239b3f44992c", size = 38677, upload-time = "2025-07-27T13:04:10.334Z" },
{ url = "https://files.pythonhosted.org/packages/bf/0f/abe4b5a28529ef5f74e8348fa6a9ef27d7d75fbd98103d7664cf485b7d8f/pybase64-1.4.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:39aef1dadf4a004f11dd09e703abaf6528a87c8dbd39c448bb8aebdc0a08c1be", size = 32066, upload-time = "2025-07-27T13:04:11.641Z" },
{ url = "https://files.pythonhosted.org/packages/ac/7e/ea0ce6a7155cada5526017ec588b6d6185adea4bf9331565272f4ef583c2/pybase64-1.4.2-cp313-cp313t-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:91cb920c7143e36ec8217031282c8651da3b2206d70343f068fac0e7f073b7f9", size = 72300, upload-time = "2025-07-27T13:04:12.969Z" },
{ url = "https://files.pythonhosted.org/packages/45/2d/e64c7a056c9ec48dfe130d1295e47a8c2b19c3984488fc08e5eaa1e86c88/pybase64-1.4.2-cp313-cp313t-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:6958631143fb9e71f9842000da042ec2f6686506b6706e2dfda29e97925f6aa0", size = 75520, upload-time = "2025-07-27T13:04:14.374Z" },
{ url = "https://files.pythonhosted.org/packages/43/e0/e5f93b2e1cb0751a22713c4baa6c6eaf5f307385e369180486c8316ed21e/pybase64-1.4.2-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:dc35f14141ef3f1ac70d963950a278a2593af66fe5a1c7a208e185ca6278fa25", size = 65384, upload-time = "2025-07-27T13:04:16.204Z" },
{ url = "https://files.pythonhosted.org/packages/ff/23/8c645a1113ad88a1c6a3d0e825e93ef8b74ad3175148767853a0a4d7626e/pybase64-1.4.2-cp313-cp313t-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:5d949d2d677859c3a8507e1b21432a039d2b995e0bd3fe307052b6ded80f207a", size = 60471, upload-time = "2025-07-27T13:04:17.947Z" },
{ url = "https://files.pythonhosted.org/packages/8b/81/edd0f7d8b0526b91730a0dd4ce6b4c8be2136cd69d424afe36235d2d2a06/pybase64-1.4.2-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:09caacdd3e15fe7253a67781edd10a6a918befab0052a2a3c215fe5d1f150269", size = 63945, upload-time = "2025-07-27T13:04:19.383Z" },
{ url = "https://files.pythonhosted.org/packages/a5/a5/edc224cd821fd65100b7af7c7e16b8f699916f8c0226c9c97bbae5a75e71/pybase64-1.4.2-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:e44b0e793b23f28ea0f15a9754bd0c960102a2ac4bccb8fafdedbd4cc4d235c0", size = 64858, upload-time = "2025-07-27T13:04:20.807Z" },
{ url = "https://files.pythonhosted.org/packages/11/3b/92853f968f1af7e42b7e54d21bdd319097b367e7dffa2ca20787361df74c/pybase64-1.4.2-cp313-cp313t-manylinux_2_31_riscv64.whl", hash = "sha256:849f274d0bcb90fc6f642c39274082724d108e41b15f3a17864282bd41fc71d5", size = 58557, upload-time = "2025-07-27T13:04:22.229Z" },
{ url = "https://files.pythonhosted.org/packages/76/09/0ec6bd2b2303b0ea5c6da7535edc9a608092075ef8c0cdd96e3e726cd687/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:528dba7ef1357bd7ce1aea143084501f47f5dd0fff7937d3906a68565aa59cfe", size = 63624, upload-time = "2025-07-27T13:04:23.952Z" },
{ url = "https://files.pythonhosted.org/packages/73/6e/52cb1ced2a517a3118b2e739e9417432049013ac7afa15d790103059e8e4/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_armv7l.whl", hash = "sha256:1da54be743d9a68671700cfe56c3ab8c26e8f2f5cc34eface905c55bc3a9af94", size = 56174, upload-time = "2025-07-27T13:04:25.419Z" },
{ url = "https://files.pythonhosted.org/packages/5b/9d/820fe79347467e48af985fe46180e1dd28e698ade7317bebd66de8a143f5/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:9b07c0406c3eaa7014499b0aacafb21a6d1146cfaa85d56f0aa02e6d542ee8f3", size = 72640, upload-time = "2025-07-27T13:04:26.824Z" },
{ url = "https://files.pythonhosted.org/packages/53/58/e863e10d08361e694935c815b73faad7e1ab03f99ae154d86c4e2f331896/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:312f2aa4cf5d199a97fbcaee75d2e59ebbaafcd091993eb373b43683498cdacb", size = 62453, upload-time = "2025-07-27T13:04:28.562Z" },
{ url = "https://files.pythonhosted.org/packages/95/f0/c392c4ac8ccb7a34b28377c21faa2395313e3c676d76c382642e19a20703/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_riscv64.whl", hash = "sha256:ad59362fc267bf15498a318c9e076686e4beeb0dfe09b457fabbc2b32468b97a", size = 58103, upload-time = "2025-07-27T13:04:29.996Z" },
{ url = "https://files.pythonhosted.org/packages/32/30/00ab21316e7df8f526aa3e3dc06f74de6711d51c65b020575d0105a025b2/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:01593bd064e7dcd6c86d04e94e44acfe364049500c20ac68ca1e708fbb2ca970", size = 60779, upload-time = "2025-07-27T13:04:31.549Z" },
{ url = "https://files.pythonhosted.org/packages/a6/65/114ca81839b1805ce4a2b7d58bc16e95634734a2059991f6382fc71caf3e/pybase64-1.4.2-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5b81547ad8ea271c79fdf10da89a1e9313cb15edcba2a17adf8871735e9c02a0", size = 74684, upload-time = "2025-07-27T13:04:32.976Z" },
{ url = "https://files.pythonhosted.org/packages/54/8f/aa9d445b9bb693b8f6bb1456bd6d8576d79b7a63bf6c69af3a539235b15f/pybase64-1.4.2-cp313-cp313t-win32.whl", hash = "sha256:7edbe70b5654545a37e6e6b02de738303b1bbdfcde67f6cfec374cfb5cc4099e", size = 33961, upload-time = "2025-07-27T13:04:34.806Z" },
{ url = "https://files.pythonhosted.org/packages/0e/e5/da37cfb173c646fd4fc7c6aae2bc41d40de2ee49529854af8f4e6f498b45/pybase64-1.4.2-cp313-cp313t-win_amd64.whl", hash = "sha256:385690addf87c25d6366fab5d8ff512eed8a7ecb18da9e8152af1c789162f208", size = 36199, upload-time = "2025-07-27T13:04:36.223Z" },
{ url = "https://files.pythonhosted.org/packages/66/3e/1eb68fb7d00f2cec8bd9838e2a30d183d6724ae06e745fd6e65216f170ff/pybase64-1.4.2-cp313-cp313t-win_arm64.whl", hash = "sha256:c2070d0aa88580f57fe15ca88b09f162e604d19282915a95a3795b5d3c1c05b5", size = 31221, upload-time = "2025-07-27T13:04:37.704Z" },
{ url = "https://files.pythonhosted.org/packages/99/bf/00a87d951473ce96c8c08af22b6983e681bfabdb78dd2dcf7ee58eac0932/pybase64-1.4.2-cp314-cp314-ios_13_0_arm64_iphoneos.whl", hash = "sha256:4157ad277a32cf4f02a975dffc62a3c67d73dfa4609b2c1978ef47e722b18b8e", size = 30924, upload-time = "2025-07-27T13:04:39.189Z" },
{ url = "https://files.pythonhosted.org/packages/ae/43/dee58c9d60e60e6fb32dc6da722d84592e22f13c277297eb4ce6baf99a99/pybase64-1.4.2-cp314-cp314-ios_13_0_arm64_iphonesimulator.whl", hash = "sha256:e113267dc349cf624eb4f4fbf53fd77835e1aa048ac6877399af426aab435757", size = 31390, upload-time = "2025-07-27T13:04:40.995Z" },
{ url = "https://files.pythonhosted.org/packages/e1/11/b28906fc2e330b8b1ab4bc845a7bef808b8506734e90ed79c6062b095112/pybase64-1.4.2-cp314-cp314-ios_13_0_x86_64_iphonesimulator.whl", hash = "sha256:cea5aaf218fd9c5c23afacfe86fd4464dfedc1a0316dd3b5b4075b068cc67df0", size = 38212, upload-time = "2025-07-27T13:04:42.729Z" },
{ url = "https://files.pythonhosted.org/packages/24/9e/868d1e104413d14b19feaf934fc7fad4ef5b18946385f8bb79684af40f24/pybase64-1.4.2-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:41213497abbd770435c7a9c8123fb02b93709ac4cf60155cd5aefc5f3042b600", size = 38303, upload-time = "2025-07-27T13:04:44.095Z" },
{ url = "https://files.pythonhosted.org/packages/a3/73/f7eac96ca505df0600280d6bfc671a9e2e2f947c2b04b12a70e36412f7eb/pybase64-1.4.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:c8b522df7ee00f2ac1993ccd5e1f6608ae7482de3907668c2ff96a83ef213925", size = 31669, upload-time = "2025-07-27T13:04:45.845Z" },
{ url = "https://files.pythonhosted.org/packages/c6/43/8e18bea4fd455100112d6a73a83702843f067ef9b9272485b6bdfd9ed2f0/pybase64-1.4.2-cp314-cp314-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:06725022e540c5b098b978a0418ca979773e2cbdbb76f10bd97536f2ad1c5b49", size = 68452, upload-time = "2025-07-27T13:04:47.788Z" },
{ url = "https://files.pythonhosted.org/packages/e4/2e/851eb51284b97354ee5dfa1309624ab90920696e91a33cd85b13d20cc5c1/pybase64-1.4.2-cp314-cp314-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:a3e54dcf0d0305ec88473c9d0009f698cabf86f88a8a10090efeff2879c421bb", size = 71674, upload-time = "2025-07-27T13:04:49.294Z" },
{ url = "https://files.pythonhosted.org/packages/57/0d/5cf1e5dc64aec8db43e8dee4e4046856d639a72bcb0fb3e716be42ced5f1/pybase64-1.4.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:67675cee727a60dc91173d2790206f01aa3c7b3fbccfa84fd5c1e3d883fe6caa", size = 60027, upload-time = "2025-07-27T13:04:50.769Z" },
{ url = "https://files.pythonhosted.org/packages/a4/8e/3479266bc0e65f6cc48b3938d4a83bff045330649869d950a378f2ddece0/pybase64-1.4.2-cp314-cp314-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:753da25d4fd20be7bda2746f545935773beea12d5cb5ec56ec2d2960796477b1", size = 56461, upload-time = "2025-07-27T13:04:52.37Z" },
{ url = "https://files.pythonhosted.org/packages/20/b6/f2b6cf59106dd78bae8717302be5b814cec33293504ad409a2eb752ad60c/pybase64-1.4.2-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:a78c768ce4ca550885246d14babdb8923e0f4a848dfaaeb63c38fc99e7ea4052", size = 59446, upload-time = "2025-07-27T13:04:53.967Z" },
{ url = "https://files.pythonhosted.org/packages/16/70/3417797dfccdfdd0a54e4ad17c15b0624f0fc2d6a362210f229f5c4e8fd0/pybase64-1.4.2-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:51b17f36d890c92f0618fb1c8db2ccc25e6ed07afa505bab616396fc9b0b0492", size = 60350, upload-time = "2025-07-27T13:04:55.881Z" },
{ url = "https://files.pythonhosted.org/packages/a0/c6/6e4269dd98d150ae95d321b311a345eae0f7fd459d97901b4a586d7513bb/pybase64-1.4.2-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:f92218d667049ab4f65d54fa043a88ffdb2f07fff1f868789ef705a5221de7ec", size = 54989, upload-time = "2025-07-27T13:04:57.436Z" },
{ url = "https://files.pythonhosted.org/packages/f9/e8/18c1b0c255f964fafd0412b0d5a163aad588aeccb8f84b9bf9c8611d80f6/pybase64-1.4.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:3547b3d1499919a06491b3f879a19fbe206af2bd1a424ecbb4e601eb2bd11fea", size = 58724, upload-time = "2025-07-27T13:04:59.406Z" },
{ url = "https://files.pythonhosted.org/packages/b1/ad/ddfbd2125fc20b94865fb232b2e9105376fa16eee492e4b7786d42a86cbf/pybase64-1.4.2-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:958af7b0e09ddeb13e8c2330767c47b556b1ade19c35370f6451d139cde9f2a9", size = 52285, upload-time = "2025-07-27T13:05:01.198Z" },
{ url = "https://files.pythonhosted.org/packages/b6/4c/b9d4ec9224add33c84b925a03d1a53cd4106efb449ea8e0ae7795fed7bf7/pybase64-1.4.2-cp314-cp314-musllinux_1_2_i686.whl", hash = "sha256:4facc57f6671e2229a385a97a618273e7be36a9ea0a9d1c1b9347f14d19ceba8", size = 69036, upload-time = "2025-07-27T13:05:03.109Z" },
{ url = "https://files.pythonhosted.org/packages/92/38/7b96794da77bed3d9b4fea40f14ae563648fba83a696e7602fabe60c0eb7/pybase64-1.4.2-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:a32fc57d05d73a7c9b0ca95e9e265e21cf734195dc6873829a890058c35f5cfd", size = 57938, upload-time = "2025-07-27T13:05:04.744Z" },
{ url = "https://files.pythonhosted.org/packages/eb/c5/ae8bbce3c322d1b074e79f51f5df95961fe90cb8748df66c6bc97616e974/pybase64-1.4.2-cp314-cp314-musllinux_1_2_riscv64.whl", hash = "sha256:3dc853243c81ce89cc7318e6946f860df28ddb7cd2a0648b981652d9ad09ee5a", size = 54474, upload-time = "2025-07-27T13:05:06.662Z" },
{ url = "https://files.pythonhosted.org/packages/15/9a/c09887c4bb1b43c03fc352e2671ef20c6686c6942a99106a45270ee5b840/pybase64-1.4.2-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:0e6d863a86b3e7bc6ac9bd659bebda4501b9da842521111b0b0e54eb51295df5", size = 56533, upload-time = "2025-07-27T13:05:08.368Z" },
{ url = "https://files.pythonhosted.org/packages/4f/0f/d5114d63d35d085639606a880cb06e2322841cd4b213adfc14d545c1186f/pybase64-1.4.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:6579475140ff2067903725d8aca47f5747bcb211597a1edd60b58f6d90ada2bd", size = 71030, upload-time = "2025-07-27T13:05:10.3Z" },
{ url = "https://files.pythonhosted.org/packages/40/0e/fe6f1ed22ea52eb99f490a8441815ba21de288f4351aeef4968d71d20d2d/pybase64-1.4.2-cp314-cp314-win32.whl", hash = "sha256:373897f728d7b4f241a1f803ac732c27b6945d26d86b2741ad9b75c802e4e378", size = 34174, upload-time = "2025-07-27T13:05:12.254Z" },
{ url = "https://files.pythonhosted.org/packages/71/46/0e15bea52ffc63e8ae7935e945accbaf635e0aefa26d3e31fdf9bc9dcd01/pybase64-1.4.2-cp314-cp314-win_amd64.whl", hash = "sha256:1afe3361344617d298c1d08bc657ef56d0f702d6b72cb65d968b2771017935aa", size = 36308, upload-time = "2025-07-27T13:05:13.898Z" },
{ url = "https://files.pythonhosted.org/packages/4f/dc/55849fee2577bda77c1e078da04cc9237e8e474a8c8308deb702a26f2511/pybase64-1.4.2-cp314-cp314-win_arm64.whl", hash = "sha256:f131c9360babe522f3d90f34da3f827cba80318125cf18d66f2ee27e3730e8c4", size = 31341, upload-time = "2025-07-27T13:05:15.553Z" },
{ url = "https://files.pythonhosted.org/packages/39/44/c69d088e28b25e70ac742b6789cde038473815b2a69345c4bae82d5e244d/pybase64-1.4.2-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:2583ac304131c1bd6e3120b0179333610f18816000db77c0a2dd6da1364722a8", size = 38678, upload-time = "2025-07-27T13:05:17.544Z" },
{ url = "https://files.pythonhosted.org/packages/00/93/2860ec067497b9cbb06242f96d44caebbd9eed32174e4eb8c1ffef760f94/pybase64-1.4.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:75a8116be4ea4cdd30a5c4f1a6f3b038e0d457eb03c8a2685d8ce2aa00ef8f92", size = 32066, upload-time = "2025-07-27T13:05:19.18Z" },
{ url = "https://files.pythonhosted.org/packages/d3/55/1e96249a38759332e8a01b31c370d88c60ceaf44692eb6ba4f0f451ee496/pybase64-1.4.2-cp314-cp314t-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:217ea776a098d7c08668e5526b9764f5048bbfd28cac86834217ddfe76a4e3c4", size = 72465, upload-time = "2025-07-27T13:05:20.866Z" },
{ url = "https://files.pythonhosted.org/packages/6d/ab/0f468605b899f3e35dbb7423fba3ff98aeed1ec16abb02428468494a58f4/pybase64-1.4.2-cp314-cp314t-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:4ec14683e343c95b14248cdfdfa78c052582be7a3865fd570aa7cffa5ab5cf37", size = 75693, upload-time = "2025-07-27T13:05:22.896Z" },
{ url = "https://files.pythonhosted.org/packages/91/d1/9980a0159b699e2489baba05b71b7c953b29249118ba06fdbb3e9ea1b9b5/pybase64-1.4.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:480ecf21e1e956c5a10d3cf7b3b7e75bce3f9328cf08c101e4aab1925d879f34", size = 65577, upload-time = "2025-07-27T13:05:25Z" },
{ url = "https://files.pythonhosted.org/packages/16/86/b27e7b95f9863d245c0179a7245582eda3d262669d8f822777364d8fd7d5/pybase64-1.4.2-cp314-cp314t-manylinux2014_armv7l.manylinux_2_17_armv7l.whl", hash = "sha256:1fe1ebdc55e9447142e2f6658944aadfb5a4fbf03dbd509be34182585515ecc1", size = 60662, upload-time = "2025-07-27T13:05:27.138Z" },
{ url = "https://files.pythonhosted.org/packages/28/87/a7f0dde0abc26bfbee761f1d3558eb4b139f33ddd9fe1f6825ffa7daa22d/pybase64-1.4.2-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.whl", hash = "sha256:c793a2b06753accdaf5e1a8bbe5d800aab2406919e5008174f989a1ca0081411", size = 64179, upload-time = "2025-07-27T13:05:28.996Z" },
{ url = "https://files.pythonhosted.org/packages/1e/88/5d6fa1c60e1363b4cac4c396978f39e9df4689e75225d7d9c0a5998e3a14/pybase64-1.4.2-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.whl", hash = "sha256:6acae6e1d1f7ebe40165f08076c7a73692b2bf9046fefe673f350536e007f556", size = 64968, upload-time = "2025-07-27T13:05:30.818Z" },
{ url = "https://files.pythonhosted.org/packages/20/6e/2ed585af5b2211040445d9849326dd2445320c9316268794f5453cfbaf30/pybase64-1.4.2-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:88b91cd0949358aadcea75f8de5afbcf3c8c5fb9ec82325bd24285b7119cf56e", size = 58738, upload-time = "2025-07-27T13:05:32.629Z" },
{ url = "https://files.pythonhosted.org/packages/ce/94/e2960b56322eabb3fbf303fc5a72e6444594c1b90035f3975c6fe666db5c/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:53316587e1b1f47a11a5ff068d3cbd4a3911c291f2aec14882734973684871b2", size = 63802, upload-time = "2025-07-27T13:05:34.687Z" },
{ url = "https://files.pythonhosted.org/packages/95/47/312139d764c223f534f751528ce3802887c279125eac64f71cd3b4e05abc/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_armv7l.whl", hash = "sha256:caa7f20f43d00602cf9043b5ba758d54f5c41707d3709b2a5fac17361579c53c", size = 56341, upload-time = "2025-07-27T13:05:36.554Z" },
{ url = "https://files.pythonhosted.org/packages/3f/d7/aec9a6ed53b128dac32f8768b646ca5730c88eef80934054d7fa7d02f3ef/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_i686.whl", hash = "sha256:2d93817e24fdd79c534ed97705df855af6f1d2535ceb8dfa80da9de75482a8d7", size = 72838, upload-time = "2025-07-27T13:05:38.459Z" },
{ url = "https://files.pythonhosted.org/packages/e3/a8/6ccc54c5f1f7c3450ad7c56da10c0f131d85ebe069ea6952b5b42f2e92d9/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:63cd769b51474d8d08f7f2ce73b30380d9b4078ec92ea6b348ea20ed1e1af88a", size = 62633, upload-time = "2025-07-27T13:05:40.624Z" },
{ url = "https://files.pythonhosted.org/packages/34/22/2b9d89f8ff6f2a01d6d6a88664b20a4817049cfc3f2c62caca040706660c/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_riscv64.whl", hash = "sha256:cd07e6a9993c392ec8eb03912a43c6a6b21b2deb79ee0d606700fe276e9a576f", size = 58282, upload-time = "2025-07-27T13:05:42.565Z" },
{ url = "https://files.pythonhosted.org/packages/b2/14/dbf6266177532a6a11804ac080ebffcee272f491b92820c39886ee20f201/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:6a8944e8194adff4668350504bc6b7dbde2dab9244c88d99c491657d145b5af5", size = 60948, upload-time = "2025-07-27T13:05:44.48Z" },
{ url = "https://files.pythonhosted.org/packages/fd/7a/b2ae9046a66dd5746cd72836a41386517b1680bea5ce02f2b4f1c9ebc688/pybase64-1.4.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:04ab398ec4b6a212af57f6a21a6336d5a1d754ff4ccb215951366ab9080481b2", size = 74854, upload-time = "2025-07-27T13:05:46.416Z" },
{ url = "https://files.pythonhosted.org/packages/ef/7e/9856f6d6c38a7b730e001123d2d9fa816b8b1a45f0cdee1d509d5947b047/pybase64-1.4.2-cp314-cp314t-win32.whl", hash = "sha256:3b9201ecdcb1c3e23be4caebd6393a4e6615bd0722528f5413b58e22e3792dd3", size = 34490, upload-time = "2025-07-27T13:05:48.304Z" },
{ url = "https://files.pythonhosted.org/packages/c7/38/8523a9dc1ec8704dedbe5ccc95192ae9a7585f7eec85cc62946fe3cacd32/pybase64-1.4.2-cp314-cp314t-win_amd64.whl", hash = "sha256:36e9b0cad8197136d73904ef5a71d843381d063fd528c5ab203fc4990264f682", size = 36680, upload-time = "2025-07-27T13:05:50.264Z" },
{ url = "https://files.pythonhosted.org/packages/3c/52/5600104ef7b85f89fb8ec54f73504ead3f6f0294027e08d281f3cafb5c1a/pybase64-1.4.2-cp314-cp314t-win_arm64.whl", hash = "sha256:f25140496b02db0e7401567cd869fb13b4c8118bf5c2428592ec339987146d8b", size = 31600, upload-time = "2025-07-27T13:05:52.24Z" },
{ url = "https://files.pythonhosted.org/packages/32/34/b67371f4fcedd5e2def29b1cf92a4311a72f590c04850f370c75297b48ce/pybase64-1.4.2-graalpy311-graalpy242_311_native-macosx_10_9_x86_64.whl", hash = "sha256:b4eed40a5f1627ee65613a6ac834a33f8ba24066656f569c852f98eb16f6ab5d", size = 38667, upload-time = "2025-07-27T13:07:25.315Z" },
{ url = "https://files.pythonhosted.org/packages/aa/3e/e57fe09ed1c7e740d21c37023c5f7c8963b4c36380f41d10261cc76f93b4/pybase64-1.4.2-graalpy311-graalpy242_311_native-macosx_11_0_arm64.whl", hash = "sha256:57885fa521e9add235af4db13e9e048d3a2934cd27d7c5efac1925e1b4d6538d", size = 32094, upload-time = "2025-07-27T13:07:28.235Z" },
{ url = "https://files.pythonhosted.org/packages/51/34/f40d3262c3953814b9bcdcf858436bd5bc1133a698be4bcc7ed2a8c0730d/pybase64-1.4.2-graalpy311-graalpy242_311_native-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:eef9255d926c64e2fca021d3aee98023bacb98e1518e5986d6aab04102411b04", size = 43212, upload-time = "2025-07-27T13:07:31.327Z" },
{ url = "https://files.pythonhosted.org/packages/8c/2a/5e05d25718cb8ffd68bd46553ddfd2b660893d937feda1716b8a3b21fb38/pybase64-1.4.2-graalpy311-graalpy242_311_native-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:89614ea2d2329b6708746c540e0f14d692125df99fb1203ff0de948d9e68dfc9", size = 35789, upload-time = "2025-07-27T13:07:34.026Z" },
{ url = "https://files.pythonhosted.org/packages/d5/9d/f56c3ee6e94faaae2896ecaf666428330cb24096abf7d2427371bb2b403a/pybase64-1.4.2-graalpy311-graalpy242_311_native-win_amd64.whl", hash = "sha256:e401cecd2d7ddcd558768b2140fd4430746be4d17fb14c99eec9e40789df136d", size = 35861, upload-time = "2025-07-27T13:07:37.099Z" },
{ url = "https://files.pythonhosted.org/packages/fb/04/bfe2bd0d76385750f3541724b4abfe4ea111b3cc01ff7e83f410054adc30/pybase64-1.4.2-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:4b29c93414ba965777643a9d98443f08f76ac04519ad717aa859113695372a07", size = 38226, upload-time = "2025-07-27T13:07:40.121Z" },
{ url = "https://files.pythonhosted.org/packages/22/13/c717855760b78ded1a9d308984c7e3e99fcf79c6cac5a231ed8c1238218f/pybase64-1.4.2-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:5e0c3353c0bf099c5c3f8f750202c486abee8f23a566b49e9e7b1222fbf5f259", size = 31524, upload-time = "2025-07-27T13:07:43.946Z" },
{ url = "https://files.pythonhosted.org/packages/cf/da/2b7e69abfc62abe4d54b10d1e09ec78021a6b9b2d7e6e7b632243a19433e/pybase64-1.4.2-pp310-pypy310_pp73-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:4f98c5c6152d3c01d933fcde04322cd9ddcf65b5346034aac69a04c1a7cbb012", size = 40667, upload-time = "2025-07-27T13:07:46.715Z" },
{ url = "https://files.pythonhosted.org/packages/f1/11/ba738655fb3ba85c7a0605eddd2709fef606e654840c72ee5c5ff7ab29bf/pybase64-1.4.2-pp310-pypy310_pp73-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:9096a4977b7aff7ef250f759fb6a4b6b7b6199d99c84070c7fc862dd3b208b34", size = 41290, upload-time = "2025-07-27T13:07:49.534Z" },
{ url = "https://files.pythonhosted.org/packages/5d/38/2d5502fcaf712297b95c1b6ca924656dd7d17501fd7f9c9e0b3bbf8892ef/pybase64-1.4.2-pp310-pypy310_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:49d8597e2872966399410502310b1e2a5b7e8d8ba96766ee1fe242e00bd80775", size = 35438, upload-time = "2025-07-27T13:07:52.327Z" },
{ url = "https://files.pythonhosted.org/packages/b6/db/e03b8b6daa60a3fbef21741403e0cf18b2aff3beebdf6e3596bb9bab16c7/pybase64-1.4.2-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:2ef16366565389a287df82659e055e88bdb6c36e46a3394950903e0a9cb2e5bf", size = 36121, upload-time = "2025-07-27T13:07:55.54Z" },
{ url = "https://files.pythonhosted.org/packages/0e/bf/5ebaa2d9ddb5fc506633bc8b820fc27e64da964937fb30929c0367c47d00/pybase64-1.4.2-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:0a5393be20b0705870f5a8969749af84d734c077de80dd7e9f5424a247afa85e", size = 38162, upload-time = "2025-07-27T13:07:58.364Z" },
{ url = "https://files.pythonhosted.org/packages/25/41/795c5fd6e5571bb675bf9add8a048166dddf8951c2a903fea8557743886b/pybase64-1.4.2-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:448f0259a2f1a17eb086f70fe2ad9b556edba1fc5bc4e62ce6966179368ee9f8", size = 31452, upload-time = "2025-07-27T13:08:01.259Z" },
{ url = "https://files.pythonhosted.org/packages/aa/dd/c819003b59b2832256b72ad23cbeadbd95d083ef0318d07149a58b7a88af/pybase64-1.4.2-pp311-pypy311_pp73-manylinux1_i686.manylinux2014_i686.manylinux_2_17_i686.manylinux_2_5_i686.whl", hash = "sha256:1159e70cba8e76c3d8f334bd1f8fd52a1bb7384f4c3533831b23ab2df84a6ef3", size = 40668, upload-time = "2025-07-27T13:08:04.176Z" },
{ url = "https://files.pythonhosted.org/packages/0e/c5/38c6aba28678c4a4db49312a6b8171b93a0ffe9f21362cf4c0f325caa850/pybase64-1.4.2-pp311-pypy311_pp73-manylinux1_x86_64.manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:7d943bc5dad8388971494554b97f22ae06a46cc7779ad0de3d4bfdf7d0bbea30", size = 41281, upload-time = "2025-07-27T13:08:07.395Z" },
{ url = "https://files.pythonhosted.org/packages/e5/23/5927bd9e59714e4e8cefd1d21ccd7216048bb1c6c3e7104b1b200afdc63d/pybase64-1.4.2-pp311-pypy311_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:10b99182c561d86422c5de4265fd1f8f172fb38efaed9d72c71fb31e279a7f94", size = 35433, upload-time = "2025-07-27T13:08:10.551Z" },
{ url = "https://files.pythonhosted.org/packages/01/0f/fab7ed5bf4926523c3b39f7621cea3e0da43f539fbc2270e042f1afccb79/pybase64-1.4.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:bb082c1114f046e59fcbc4f2be13edc93b36d7b54b58605820605be948f8fdf6", size = 36131, upload-time = "2025-07-27T13:08:13.777Z" },
]
[[package]]
name = "pycairo"
version = "1.28.0"
@@ -6560,16 +6483,16 @@ wheels = [
[[package]]
name = "smithy-aws-core"
version = "0.1.0"
version = "0.1.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "aws-sdk-signers", marker = "python_full_version >= '3.12'" },
{ name = "smithy-core", marker = "python_full_version >= '3.12'" },
{ name = "smithy-http", marker = "python_full_version >= '3.12'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/ec/e8/8cef48be92ed09a112c54747a4515313ba96e767e7e0118a769aeb147e07/smithy_aws_core-0.1.0.tar.gz", hash = "sha256:5f197b69ad1380e9118e1e3c9032e0e305525ef56fb4fc97dea6414281065526", size = 11135, upload-time = "2025-09-29T19:37:13.072Z" }
sdist = { url = "https://files.pythonhosted.org/packages/56/d3/f847e0fd36b95aa36ce3a4c9ce1a08e16b2aa9a56b71714045c9c924e282/smithy_aws_core-0.1.1.tar.gz", hash = "sha256:78dfd7040fc2bc72b6af293096642fc9a7bfd2db28ddbdf7c4110535eab9d662", size = 11196, upload-time = "2025-10-21T20:21:18.648Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/46/7e/6d05275646bc2cdf7b0749e9bd54958a4e808aafeee4d8ff2fdaa8233dc2/smithy_aws_core-0.1.0-py3-none-any.whl", hash = "sha256:a8cda4011562f45f1fc5957c3a981b6016d736178450e5f2a1586937632af487", size = 18959, upload-time = "2025-09-29T19:37:12.041Z" },
{ url = "https://files.pythonhosted.org/packages/d0/04/87cb06f0f6d664b5cffdef6d4042dd52c11c138436084d30ffdaa3543031/smithy_aws_core-0.1.1-py3-none-any.whl", hash = "sha256:0d1634f276c2999dc2a04fafef63b9d28309de50d939d1d49df952773a7063c4", size = 18963, upload-time = "2025-10-21T20:21:17.692Z" },
]
[package.optional-dependencies]
@@ -6603,14 +6526,14 @@ wheels = [
[[package]]
name = "smithy-http"
version = "0.1.0"
version = "0.2.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "smithy-core", marker = "python_full_version >= '3.12'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/4e/62/5ba46c7432fbb0852acf8340402879ba53bb4c009b875e1b5b2e9df844ff/smithy_http-0.1.0.tar.gz", hash = "sha256:ed44552531f594e31101f7186c7b01b508ecd38a860b45390a1cce7da700df4b", size = 28269, upload-time = "2025-09-29T19:37:18.629Z" }
sdist = { url = "https://files.pythonhosted.org/packages/3c/1c/44e99a7dfb8c39bf0c3d998accdf4573a7a3488863b90f21af260cec2d45/smithy_http-0.2.0.tar.gz", hash = "sha256:2382562fa9af326be455f14b18615f16ffe9db756e51b2a4ca0d23e1b881cff8", size = 26729, upload-time = "2025-10-21T20:21:06.146Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/5b/23/d18076ea45b3000c5e9eb8ebd75a4ea1b65b5c59e5c2080a119e2679dfba/smithy_http-0.1.0-py3-none-any.whl", hash = "sha256:7657aaf4b9e025cb9d317406f417b49cf19fba9d1b2ab4f5e6d9dc5a2dd7cdba", size = 38995, upload-time = "2025-09-29T19:37:17.506Z" },
{ url = "https://files.pythonhosted.org/packages/d4/e2/d475fad81ac74ec0e145cb6d72afe5ecde4e2358bd632c2fd5d3f4bc87dc/smithy_http-0.2.0-py3-none-any.whl", hash = "sha256:49ee2402d7737798d70f99f491fbfb2a5767283ae562e21b6f86e3fd14f3e3e0", size = 37328, upload-time = "2025-10-21T20:21:05.362Z" },
]
[package.optional-dependencies]
@@ -6696,14 +6619,14 @@ wheels = [
[[package]]
name = "speechmatics-rt"
version = "0.4.2"
version = "0.5.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "websockets" },
]
sdist = { url = "https://files.pythonhosted.org/packages/17/2e/d694390d58b9b6807280441d1275856f5a316c3e8a815c2037502636bbea/speechmatics_rt-0.4.2.tar.gz", hash = "sha256:c0f7ed34442b0f505a12d1b19c8cc8dc2cc0b1a423aeb5669ca0738fc5e59f0d", size = 26142, upload-time = "2025-09-30T10:50:36.804Z" }
sdist = { url = "https://files.pythonhosted.org/packages/57/26/10359e1f16c2aa6a198eb11a9056f4a86a8bb8d4e610bbbe4a118b227b59/speechmatics_rt-0.5.0.tar.gz", hash = "sha256:ca974a186a012f946fd997deeaf3bf1c4f203f6d6e05a866172d27709183afc8", size = 26832, upload-time = "2025-10-15T15:54:25.695Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/9a/c7/2cd551c71e14256ca463f31feec17f466b57c2730d636e20803e7a541104/speechmatics_rt-0.4.2-py3-none-any.whl", hash = "sha256:70b91ff750e2f7516eaf1839d39f7a8ac65ff6665638b837cf67bab9cc9967bc", size = 32131, upload-time = "2025-09-30T10:50:35.656Z" },
{ url = "https://files.pythonhosted.org/packages/47/2e/9931ebe9360e9d385c68826b33137c2c9a4cfa361cd929d1ac6e72ebfe53/speechmatics_rt-0.5.0-py3-none-any.whl", hash = "sha256:58151488f891fa00cf7054f0cfab1b1eb94b55c3441be587f7941c726caef991", size = 32850, upload-time = "2025-10-15T15:54:24.5Z" },
]
[[package]]
@@ -7528,7 +7451,7 @@ wheels = [
[[package]]
name = "vllm"
version = "0.9.2"
version = "0.9.1"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "aiohttp" },
@@ -7552,6 +7475,10 @@ dependencies = [
{ name = "numpy" },
{ name = "openai" },
{ name = "opencv-python-headless" },
{ name = "opentelemetry-api" },
{ name = "opentelemetry-exporter-otlp" },
{ name = "opentelemetry-sdk" },
{ name = "opentelemetry-semantic-conventions-ai" },
{ name = "outlines" },
{ name = "partial-json-parser" },
{ name = "pillow" },
@@ -7560,7 +7487,6 @@ dependencies = [
{ name = "protobuf" },
{ name = "psutil" },
{ name = "py-cpuinfo" },
{ name = "pybase64" },
{ name = "pydantic" },
{ name = "python-json-logger" },
{ name = "pyyaml" },
@@ -7583,11 +7509,11 @@ dependencies = [
{ name = "typing-extensions" },
{ name = "watchfiles" },
{ name = "xformers", marker = "platform_machine == 'x86_64' and sys_platform == 'linux'" },
{ name = "xgrammar", marker = "platform_machine == 'aarch64' or platform_machine == 'arm64' or platform_machine == 'x86_64'" },
{ name = "xgrammar", marker = "platform_machine == 'aarch64' or platform_machine == 'x86_64'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/35/89/2fbf95d398b5751b44c7256bd80e57c589142f1bfcc15f5dc76438b8853a/vllm-0.9.2.tar.gz", hash = "sha256:6b0d855ea8ba18d76364c9b82ea94bfcaa9c9e724055438b5733e4716ed104e1", size = 8997087, upload-time = "2025-07-08T04:49:01.722Z" }
sdist = { url = "https://files.pythonhosted.org/packages/c5/5b/5f42b41d045c01821be62162fc6b1cfb14db1674027c7b623adb3a66dccf/vllm-0.9.1.tar.gz", hash = "sha256:c5ad11603f49a1fad05c88debabb8b839780403ce1b51751ec4da4e8a838082c", size = 8670972, upload-time = "2025-06-10T21:46:12.114Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/f4/72/c14ff1acac64294f45782769b9c8144a1c3e8d4f2228d4648197511b015a/vllm-0.9.2-cp38-abi3-manylinux1_x86_64.whl", hash = "sha256:f3c5da29a286f4933b480a5b4749fab226564f35c96928eeef547f88d385cd34", size = 383350132, upload-time = "2025-07-08T04:48:54.133Z" },
{ url = "https://files.pythonhosted.org/packages/b5/56/ffcf6215a571cf9aa58ded06a9640bff21b4918e27344677cd33290ab9da/vllm-0.9.1-cp38-abi3-manylinux1_x86_64.whl", hash = "sha256:28b99e8df39c7aaeda04f7e5353b18564a1a9d1c579691945523fc4777a1a8c8", size = 394637693, upload-time = "2025-06-10T21:46:01.784Z" },
]
[[package]]
@@ -7897,7 +7823,6 @@ name = "xgrammar"
version = "0.1.19"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "mlx-lm", marker = "platform_machine == 'arm64' and sys_platform == 'darwin'" },
{ name = "ninja" },
{ name = "pydantic" },
{ name = "sentencepiece" },