2862 Commits

Author SHA1 Message Date
James Hush
8375d299bc Revert 2025-11-28 13:53:12 +01:00
James Hush
98df964e68 fix: propagate skip_tts flag through LLM response frames
- Add skip_tts as an init parameter for TextFrame, LLMFullResponseStartFrame,
  and LLMFullResponseEndFrame instead of setting it post-init
- Update all LLM services to pass skip_tts when creating frames:
  - Anthropic, AWS (Bedrock, Nova Sonic, AgentCore), Google (Gemini, Gemini Live)
  - OpenAI (base, realtime), OpenAI Realtime Beta, SambaNova
- Add _get_skip_tts() helper method in LLMService base class
- Remove push_frame override that was setting skip_tts after frame creation
2025-11-28 13:40:09 +01:00
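The change described in the commit above (passing `skip_tts` at construction time rather than patching it on afterwards) might look roughly like the sketch below. `SketchTextFrame`, `SketchLLMService`, and `make_text_frame` are stand-ins invented for illustration, and the body of `_get_skip_tts()` is an assumption; none of this is the actual pipecat code.

```python
from dataclasses import dataclass


@dataclass
class SketchTextFrame:
    """Stand-in for a text frame; not the real pipecat class."""

    text: str
    skip_tts: bool = False  # accepted as an init parameter


class SketchLLMService:
    """Stand-in service that stamps skip_tts on every frame it creates."""

    def __init__(self, skip_tts: bool = False):
        self._skip_tts = skip_tts

    def _get_skip_tts(self) -> bool:
        # Hypothetical helper body: a single place to read the flag.
        return self._skip_tts

    def make_text_frame(self, text: str) -> SketchTextFrame:
        # The flag goes into the constructor instead of being set post-init.
        return SketchTextFrame(text, skip_tts=self._get_skip_tts())
```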
Aleix Conchillo Flaqué
b78eb5de6b Merge pull request #3148 from pipecat-ai/aleix/pipecat-0.0.96-update
update CHANGELOG for 0.0.96 with proper date
2025-11-26 17:21:31 -08:00
Aleix Conchillo Flaqué
95aa13beb1 update CHANGELOG for 0.0.96 with proper date 2025-11-26 17:16:54 -08:00
Mark Backman
88ce85342c Merge pull request #3147 from pipecat-ai/mb/fix-sagemaker-error-handling
Fix error handling in DeepgramSageMakerSTTService
2025-11-26 20:15:45 -05:00
Mark Backman
bedd40ae8b Fix error handling in DeepgramSageMakerSTTService 2025-11-26 20:12:31 -05:00
Mark Backman
fda327b3ee Merge pull request #3146 from pipecat-ai/mb/fix-aws-bedrock-region
fix: AWSBedrockLLMService was always set to us-east-1
2025-11-26 19:56:09 -05:00
Mark Backman
ace95b6e6d fix: AWSBedrockLLMService was always set to us-east-1 2025-11-26 19:52:04 -05:00
Aleix Conchillo Flaqué
26c5c28c5c Merge pull request #3145 from pipecat-ai/aleix/simli-enable-logging-param
SimliVideoService: add enable_logging input parameter
2025-11-26 16:49:12 -08:00
Aleix Conchillo Flaqué
81f862749d SimliVideoService: add enable_logging input parameter 2025-11-26 16:36:06 -08:00
Aleix Conchillo Flaqué
b8bf7b4132 Merge pull request #3143 from pipecat-ai/aleix/pipecat-0.0.96
update CHANGELOG for 0.0.96
2025-11-26 16:31:44 -08:00
Aleix Conchillo Flaqué
d90121ef3b update CHANGELOG for 0.0.96 2025-11-26 15:30:06 -08:00
Filipi da Silva Fuchter
d0b7b4fb0a Merge pull request #3144 from pipecat-ai/filipi/fix_flux_reconnection_issue
Fixed an issue with DeepgramFluxSTTService where it sometimes failed to reconnect.
2025-11-26 20:29:41 -03:00
Filipi Fuchter
4acc317923 Fixed an issue with DeepgramFluxSTTService where it sometimes failed to reconnect. 2025-11-26 20:23:03 -03:00
Filipi da Silva Fuchter
7caf5751ee Merge pull request #3084 from pipecat-ai/filipi/improve_error_handler
Improving error handler.
2025-11-26 18:40:44 -03:00
Filipi Fuchter
1330ef3ad6 Enhanced error handling across the framework.
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-11-26 18:34:25 -03:00
Mark Backman
9efb21d61e Merge pull request #3115 from pipecat-ai/mb/deepgram-websocket-tts
Update DeepgramTTSService to use Deepgram's Websocket TTS API
2025-11-26 13:30:52 -05:00
Mark Backman
6d93b8e9d8 Update DeepgramTTSService to use Deepgram's Websocket TTS API 2025-11-26 13:25:34 -05:00
Aleix Conchillo Flaqué
6f527e509e update CHANGELOG with FishAudioTTSService s1 model update 2025-11-26 10:22:59 -08:00
Aleix Conchillo Flaqué
6cf1d0417e Merge pull request #3136 from kcui5/patch-1
Update Fish Audio default model to s1
2025-11-26 10:19:26 -08:00
Mark Backman
19d8b0dfc2 Merge pull request #3011 from thsunkid/feat/add-cached-reasoning-tokens-metrics-to-opentel-spans 2025-11-26 07:45:33 -05:00
Kyle Cui
7fa0cbf2a9 Update Fish Audio default model to s1
Update default model from speech-1.5 to s1 for Fish Audio TTS service
2025-11-26 01:50:38 -08:00
Thu Nguyen
36c4bc2df2 Update changelog 2025-11-26 13:01:48 +07:00
Thu Nguyen
42be0183af Merge branch 'main' into feat/add-cached-reasoning-tokens-metrics-to-opentel-spans 2025-11-26 12:59:43 +07:00
Mark Backman
2607699664 Merge pull request #3125 from pipecat-ai/mb/fix-sagemaker-imports
fix: remove stt_sagemaker import from deepgram/__init__.py
2025-11-24 21:31:31 -05:00
Mark Backman
47fa3b8556 Merge pull request #3108 from fbarril/livekit-transport-helper
add livekit helper
2025-11-24 20:13:13 -05:00
Mark Backman
fa0100c38b fix: remove stt_sagemaker import from deepgram/__init__.py 2025-11-24 20:04:18 -05:00
kompfner
e5142c1210 Merge pull request #3113 from pipecat-ai/pk/agentcore-processor
Initial implementation of `AWSBedrockAgentCoreProcessor`
2025-11-24 19:10:44 -05:00
Paul Kompfner
5907b51c7d In AWSBedrockAgentCoreProcessor use self.create_task()/self.cancel_task() instead of using asyncio directly. 2025-11-24 18:53:39 -05:00
Paul Kompfner
9e4ec4f7f3 Implement AWSBedrockAgentCoreProcessor 2025-11-24 18:53:35 -05:00
fbarril
e2161ea63d add pyjwt as a livekit dependency 2025-11-24 23:30:11 +00:00
fbarril
7c81f66241 Merge remote-tracking branch 'origin/main' into livekit-transport-helper
# Conflicts:
#	CHANGELOG.md
#	uv.lock
2025-11-24 23:29:22 +00:00
fbarril
60da466379 add pyjwt as a livekit dependency 2025-11-24 23:27:32 +00:00
fbarril
12c29b71f3 add entry to CHANGELOG.md 2025-11-24 23:27:13 +00:00
Mark Backman
b52b108932 Merge pull request #3118 from pipecat-ai/mb/deepgram-stt-sagemaker
Add SageMaker BiDi client and DeepgramSageMakerSTTService
2025-11-24 16:47:25 -05:00
Mark Backman
a357ff0205 Alphabetize the project.optional-dependencies 2025-11-24 16:43:44 -05:00
Mark Backman
0ece8b5894 Add 07c Deepgram SageMaker example 2025-11-24 16:41:01 -05:00
Mark Backman
782b257bbb Add DeepgramSageMakerSTTService 2025-11-24 16:41:01 -05:00
Mark Backman
ab8dcd6ede Add SageMaker BiDi client 2025-11-24 16:41:00 -05:00
Mark Backman
012c2f7dde Merge pull request #3106 from pipecat-ai/mb/update-11labs-realtime-stt
Fix sample_rate issue in ElevenLabsRealtimeSTTService, add timestamps…
2025-11-24 08:10:30 -05:00
Mark Backman
87fdd8f006 Fix MiniMax changelog entries 2025-11-24 08:07:20 -05:00
Mark Backman
7bdac02837 Fix sample_rate issue in ElevenLabsRealtimeSTTService, add timestamps and logging 2025-11-24 08:06:33 -05:00
Mark Backman
861567bc59 Merge pull request #3119 from pipecat-ai/aleix/changelog-formatting
format CHANGELOG
2025-11-24 08:05:11 -05:00
Aleix Conchillo Flaqué
d0ff43134a format CHANGELOG 2025-11-23 17:48:57 -08:00
Dante Noguez
3458b74fc9 Fix 11labs realtime dynamic updates (#3117) 2025-11-22 10:02:37 -05:00
mattie ruth backman
a6202c4d1a Fixed CHANGELOG post rebase 2025-11-21 17:16:10 -05:00
mattie ruth backman
3c3141796a Overlooked Changelog updates 2025-11-21 17:16:10 -05:00
mattie ruth backman
8b8b57b09c Introduced new bot-output RTVI event to provide...
a best effort version of the bot's output

- The `RTVIObserver` now emits `bot-output` messages based off
  the new `AggregatedTextFrame`s (`bot-tts-text` and
  `bot-llm-text` are still supported and generated, but
  `bot-transcript` is now deprecated in lieu of this new, more
  thorough, message).
- The new `RTVIBotOutputMessage` includes the fields:
  - `spoken`: A boolean indicating whether the text was spoken by TTS
  - `aggregated_by`: A string representing how the text was aggregated
    ("sentence", "word", "my custom aggregation")
- Introduced new fields to `RTVIObserver` to support the new
  `bot-output` messaging:
  - `bot_output_enabled`: Defaults to `True`. Set to `False` to disable
    bot-output messages.
  - `skip_aggregator_types`: Defaults to `None`. Set to a list of
    strings that match aggregation types that should not be included
    in bot-output messages. (Ex. `credit_card`)
2025-11-21 17:16:10 -05:00
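As a rough illustration of how the two observer options described in the commit above could interact, here is a minimal sketch. The function name `build_bot_output` and the exact message layout (`type`/`data` keys) are assumptions made for this example; only the field names `spoken`, `aggregated_by`, `bot_output_enabled`, and `skip_aggregator_types` come from the commit message.

```python
from typing import Optional, Sequence


def build_bot_output(
    text: str,
    aggregated_by: str,
    spoken: bool,
    *,
    bot_output_enabled: bool = True,
    skip_aggregator_types: Optional[Sequence[str]] = None,
) -> Optional[dict]:
    """Return a bot-output style message, or None if it should be suppressed."""
    if not bot_output_enabled:
        return None
    # Aggregation types listed here (e.g. "credit_card") are never reported.
    if skip_aggregator_types and aggregated_by in skip_aggregator_types:
        return None
    return {
        "type": "bot-output",  # assumed message layout
        "data": {"text": text, "spoken": spoken, "aggregated_by": aggregated_by},
    }
```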
mattie ruth backman
4f30a48ecd Rime and Cartesia TTS Updates:
`CartesiaTTSService`:
 - Modified use of custom default text_aggregator to avoid deprecation warnings and push users
   towards use of transformers or the `LLMTextProcessor`
 - Added convenience methods for taking advantage of Cartesia's SSML tags: spell, emotion,
   pauses, volume, and speed.

`RimeTTSService`:
 - Modified use of custom default text_aggregator to avoid deprecation warnings and push users
   towards use of transformers or the `LLMTextProcessor`
 - Added convenience methods for taking advantage of Rime's customization options: spell,
   pauses, pronunciations, and inline speed control.
2025-11-21 17:16:10 -05:00
mattie ruth backman
ecbc41045c Added ability to transform text just-in-time before it gets sent to the TTS 2025-11-21 17:16:10 -05:00
mattie ruth backman
e1528d0f0c Added support to TTS services to skip sending text to the...
actual TTS service to be spoken based on its aggregation type.
2025-11-21 17:16:10 -05:00
mattie ruth backman
6b6d760cf1 Introduced LLMTextProcessor and deprecated custom text_aggregators in TTS
Introduced `LLMTextProcessor`: A new processor meant to allow customization for how
LLMTextFrames should be aggregated and considered. Its purpose is to turn
`LLMTextFrame`s into `AggregatedTextFrame`s. By default, a TTSService will still
aggregate `LLMTextFrame`s by sentence for the service to consume. However, if you
wish to override how the llm text is aggregated, you should no longer override the
TTS's internal text_aggregator, but instead, insert this processor between your LLM
and TTS in the pipeline.
2025-11-21 17:16:10 -05:00
mattie ruth backman
7a4372a909 Introduced a new AggregatedTextFrame Frame type that TTSTextFrame inherits from
This frame introduces an `aggregated_by` field to describe the type of text included
in the frame and allows unspoken groupings of text to be pushed through the pipeline
and treated similarly to TTSTextFrames.
2025-11-21 17:16:10 -05:00
mattie ruth backman
0e820a01b9 Introduce append_to_context to TextFrames
Adding support for setting whether or not the text in the TextFrame
should be added to the LLM context (by the LLM assistant aggregator).
Defaults to `True`.
2025-11-21 17:16:10 -05:00
mattie ruth backman
24266c238f Augmented PatternPairAggregator so that matched patterns can...
be treated as their own aggregation, taking advantage of the new
ability to assign a type to an aggregation
2025-11-21 17:16:10 -05:00
mattie ruth backman
dcc20f86e1 Updated the BaseTextAggregator to categorize aggregations
Modified the BaseTextAggregator type so that when text gets aggregated, metadata can
be associated with it. Currently, that just means a `type`, so that the aggregation
can be classified or described. Changes made to support this:
  - **IMPORTANT**: Aggregators are now expected to strip leading/trailing white space
    characters before returning their aggregation from `aggregation()` or `.text`. This
    way all aggregators have a consistent contract allowing downstream use to know how
    to stitch aggregations back together
  - Introduced a new `Aggregation` dataclass to represent both the aggregated `text` and
    a string identifying the `type` of aggregation (ex. "sentence", "word", "my custom
    aggregation")
  - **BREAKING**: `BaseTextAggregator.text` now returns an `Aggregation` (instead of `str`).
    To update: `aggregated_text = myAggregator.text` -> `aggregated_text = myAggregator.text.text`
  - **BREAKING**: `BaseTextAggregator.aggregate()` now returns `Optional[Aggregation]`
    (instead of `Optional[str]`). To update:
      ```
      aggregation = myAggregator.aggregate(text)
      if aggregation:
          # Read the text from the Aggregation (previously `aggregation` itself was the str).
          print(f"successfully aggregated text: {aggregation.text}")
      ```
  - `SimpleTextAggregator`, `SkipTagsAggregator`, `PatternPairAggregator` updated to
     produce/consume `Aggregation` objects.
  - All uses of the above Aggregators have been updated accordingly.
2025-11-21 17:16:10 -05:00
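To make the new contract from the commit above concrete, here is a minimal, self-contained sketch. The `Aggregation` dataclass mirrors only the two fields named in the commit message (`text` and `type`); `SketchSentenceAggregator` is a hypothetical stand-in, not one of the real aggregators, and its sentence detection is deliberately naive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Aggregation:
    """Stand-in with the two fields named in the commit message."""

    text: str
    type: str


class SketchSentenceAggregator:
    """Hypothetical aggregator that returns Aggregation objects, not str."""

    def __init__(self) -> None:
        self._buffer = ""

    @property
    def text(self) -> Aggregation:
        # Whatever is buffered so far, stripped per the new contract.
        return Aggregation(self._buffer.strip(), "sentence")

    def aggregate(self, text: str) -> Optional[Aggregation]:
        self._buffer += text
        for i, ch in enumerate(self._buffer):
            if ch in ".!?":
                sentence = self._buffer[: i + 1]
                self._buffer = self._buffer[i + 1 :]
                # Strip leading/trailing whitespace before returning.
                return Aggregation(sentence.strip(), "sentence")
        return None
```

Callers that previously treated the return value as a string would read `.text` on the result instead, as in the migration notes above.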
fbarril
ec8964425a add livekit helper 2025-11-21 00:27:57 +00:00
Vanessa Pyne
26918728df Merge pull request #3096 from pipecat-ai/vp-minimax-2962-v2
minimax 2962 language updates
2025-11-20 10:41:35 -06:00
vipyne
954849379b cleanup 2025-11-20 10:41:09 -06:00
vipyne
06542a2dbc Update CHANGELOG 2025-11-20 10:41:09 -06:00
Vanessa Pyne
59d40eac45 Update src/pipecat/services/minimax/tts.py
Co-authored-by: Mark Backman <mark@daily.co>

add warning
2025-11-20 10:41:09 -06:00
vipyne
17cf6c56cf minimax updates
some `debug`s -> `trace`s

add western US base_url to docs

ensure error_message is defined

add deprecation warning for `english_normalization` param
2025-11-20 10:41:09 -06:00
minimax
616e6ba351 docs(minimax): add API endpoint comment for west US region 2025-11-20 10:41:08 -06:00
minimax
f3cb5e0106 feat(minimax): comprehensive updates to TTS service
- Add support for speech-2.6-hd and speech-2.6-turbo models
- Add 16 new languages (total 40): Afrikaans, Bulgarian, Catalan, Danish, Persian, Filipino, Hebrew, Croatian, Hungarian, Malay, Norwegian, Nynorsk, Slovak, Slovenian, Swedish, Tamil
- Add new emotions: calm and fluent
- Add new parameters: text_normalization (renamed from english_normalization), latex_read, force_cbr, exclude_aggregated_audio, subtitle_enable, subtitle_type
- Extract trace_id from response headers for all requests
- Improve error handling for non-streaming error responses
- Add detailed extra_info logging (audio_length, audio_size, usage_characters, word_count)
- Add validation warnings for language/model compatibility
- Fix silent error issue where HTTP 200 responses with errors were ignored

BREAKING CHANGE: Renamed parameter english_normalization to text_normalization
2025-11-20 10:41:08 -06:00
Aleix Conchillo Flaqué
c89f230c99 fix CHANGELOG 2025-11-20 08:40:30 -08:00
Aleix Conchillo Flaqué
69cd5716cd Merge pull request #3102 from pipecat-ai/aleix/daily-python-0.22.0
pyproject: update daily-python to 0.22.0
2025-11-20 08:35:39 -08:00
Mark Backman
ab58f72322 Merge pull request #3101 from hwuiwon/hw/inworld-talking-speed
feat: Add speaking rate control to Inworld TTS service.
2025-11-20 09:50:55 -05:00
Hwuiwon Kim
ead361f665 fix 2025-11-20 07:45:13 -05:00
Aleix Conchillo Flaqué
fa6b8851ed pyproject: update daily-python to 0.22.0 2025-11-19 21:56:38 -08:00
Hwuiwon Kim
1cc69d475d feat: Add speaking rate control to Inworld TTS service & fix param cases 2025-11-19 22:57:53 -05:00
Mark Backman
51bdd8b728 Merge pull request #3097 from hwuiwon/fix-typo
Fix typo in STT event handler documentation
2025-11-19 17:10:32 -05:00
Hwuiwon Kim
30ff488714 Fix typo in event handler documentation 2025-11-19 17:04:07 -05:00
Vanessa Pyne
510f3df6b7 Merge pull request #3091 from pipecat-ai/vp-fix-mcp-examples
update MCP foundational examples
2025-11-19 10:35:08 -06:00
vipyne
68292bd75f rename MCP foundational examples 2025-11-19 10:34:13 -06:00
vipyne
42423bff41 update MCP foundational examples 2025-11-19 10:29:18 -06:00
Aleix Conchillo Flaqué
c3d2a25229 Merge pull request #3082 from pipecat-ai/aleix/pipecat-0.0.95
update CHANGELOG for 0.0.95
2025-11-18 21:17:07 -08:00
Aleix Conchillo Flaqué
cf1a9c1548 update CHANGELOG for 0.0.95 2025-11-18 21:14:27 -08:00
Aleix Conchillo Flaqué
51ba245e10 scripts(evals): fix EVAL_CONVERSATION/EVAL_WEATHER eval 2025-11-18 21:14:27 -08:00
Aleix Conchillo Flaqué
39b4e61837 SimliVideoService: fix connection issue 2025-11-18 19:41:47 -08:00
Aleix Conchillo Flaqué
ceaf53fdb0 LLMContext: async create_image_message/create_audio_message fixes 2025-11-18 19:41:13 -08:00
Aleix Conchillo Flaqué
f93276c64f Merge pull request #3090 from pipecat-ai/revert_function_calling_pr
Reverting: Ensure that the function call results respect the previous LLM context
2025-11-18 19:40:58 -08:00
Mark Backman
62a0f0c0f5 Merge pull request #3070 from ivaaan/hume-timestamps 2025-11-18 19:56:20 -05:00
Filipi Fuchter
793aca6b8b Revert "Ensure that the function call results respect the previous LLM context."
This reverts commit a510b276e6.
2025-11-18 21:38:49 -03:00
Filipi Fuchter
1fcaf3a4bf Revert "Searching in both _function_calls_context_messages and context messages when updating the result."
This reverts commit fccc91e923.
2025-11-18 21:38:49 -03:00
ivaaan
6484855139 fix changelog 2025-11-18 21:47:46 +01:00
ivaaan
771469b834 fix changelog 2025-11-18 21:39:29 +01:00
kompfner
a60618b0ca Merge pull request #3080 from pipecat-ai/pk/assistant-aggregator-handles-mixed-includes-inter-frame-spaces-text
`LLMAssistantAggregator` now properly aggregates text that might be a…
2025-11-18 15:24:27 -05:00
Paul Kompfner
3d21faaac2 LLMAssistantAggregator now properly aggregates text that might be a mix of includes_inter_frame_spaces=True and includes_inter_frame_spaces=False frames 2025-11-18 15:12:25 -05:00
ivaaan
f325eeb95b rm TranscriptProcessor 2 2025-11-18 20:41:10 +01:00
ivaaan
4c3fd42b1c fix changelog 2025-11-18 20:36:45 +01:00
ivaaan
c2309efd7e rm TranscriptProcessor 2025-11-18 20:35:09 +01:00
Ivan A
4ae1819645 Update src/pipecat/services/hume/tts.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-11-18 20:30:44 +01:00
Ivan A
a38f208135 Update examples/foundational/07ae-interruptible-hume.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-11-18 20:30:28 +01:00
Mark Backman
d1eb837890 Merge pull request #3081 from pipecat-ai/mb/fix-30-tts-text-frame-log
Fix foundational 30 example to output TTSTextFrames synced to audio
2025-11-18 14:10:56 -05:00
Mark Backman
153201542b Fix foundational 30 example to output TTSTextFrames synced to audio 2025-11-18 13:29:06 -05:00
Filipi da Silva Fuchter
9137e50043 Merge pull request #3053 from pipecat-ai/filipi/function_calls
Ensure that the function call results respect the previous LLM context.
2025-11-18 14:59:01 -03:00
Ivan A
8dbe119a73 Merge branch 'main' into hume-timestamps 2025-11-18 18:38:24 +01:00
ivaaan
26f96d0be8 upd example 2025-11-18 18:31:38 +01:00
ivaaan
9944e6faf0 upd service based on Mark's suggestions 2025-11-18 18:25:53 +01:00
Aleix Conchillo Flaqué
c1573c1f76 Merge pull request #3078 from pipecat-ai/aleix/llm-context-create-image-audio-async
LLMContext: create_image_message/create_audio_message are now async
2025-11-18 09:06:51 -08:00
Aleix Conchillo Flaqué
9f45ad4d2e LLMContext: create_image_message/create_audio_message are now async 2025-11-18 09:04:40 -08:00
Filipi Fuchter
fccc91e923 Searching in both _function_calls_context_messages and context messages when updating the result. 2025-11-18 11:50:28 -03:00
Filipi Fuchter
a510b276e6 Ensure that the function call results respect the previous LLM context. 2025-11-18 11:37:57 -03:00
Mark Backman
6481094638 Merge pull request #3058 from pipecat-ai/mb/add-camera-screen-support-smallwebrtc
Add camera and screen capture support to dev runner for SmallWebRTC
2025-11-18 09:22:36 -05:00
Mark Backman
3132e12265 Add camera and screen capture support to dev runner for SmallWebRTC 2025-11-18 09:19:13 -05:00
Aleix Conchillo Flaqué
12af3f79d0 Merge pull request #3060 from pipecat-ai/aleix/consumer-queue-frames
ConsumerProcessor: queue frames internally instead of pushing them
2025-11-18 00:54:18 -08:00
Aleix Conchillo Flaqué
4835617b16 ConsumerProcessor: queue frames internally instead of pushing them 2025-11-17 23:52:09 -08:00
Aleix Conchillo Flaqué
9283108240 Merge pull request #3073 from pipecat-ai/aleix/base-text-filter-only-filter
BaseTextFilter: only require subclasses to implement filter()
2025-11-17 23:29:26 -08:00
kompfner
515eaeeb1a Merge pull request #3074 from pipecat-ai/pk/tweak-moondream-example
Update Moondream example so that Moondream service output makes it in…
2025-11-17 16:52:18 -05:00
Paul Kompfner
5095fc6a64 Update Moondream example so that Moondream service output makes it into the context, even if the TTS service is disabled 2025-11-17 15:16:19 -05:00
Aleix Conchillo Flaqué
7eedb33d50 BaseTextFilter: only require subclasses to implement filter() 2025-11-17 11:23:47 -08:00
Filipi da Silva Fuchter
47f78df497 Merge pull request #3071 from pipecat-ai/filipi/small_webrtc_custom_data
Passing the custom request_data to the SmallWebRTCRunnerArguments body.
2025-11-17 15:50:11 -03:00
Filipi Fuchter
74154b26a2 Mentioning the SmallWebRTCTransport fix in the readme. 2025-11-17 15:39:07 -03:00
Filipi Fuchter
0c3c26b7b8 Passing the custom request_data to the SmallWebRTCRunnerArguments body. 2025-11-17 15:20:09 -03:00
kompfner
64417ef4ff Merge pull request #3061 from pipecat-ai/pk/greatly-simplify-inter-frame-spaces-logic
D'oh! My TTS "inter-frame-spaces" logic was *way* overcomplicated (an…
2025-11-17 10:47:56 -05:00
Paul Kompfner
f3b254e335 D'oh! My TTS "inter-frame-spaces" logic was *way* overcomplicated (and fundamentally mistaken, though it happened to work)
Now:
- For TTS word-by-word output and `TTSSpeakFrame`s: `TTSTextFrame`s have `includes_inter_frame_spaces=False`.
- For all other TTS output: `TTSTextFrame`s pass through the received text frames' `includes_inter_frame_spaces` value. So far, this value has always been `True`: LLMs send text chunks already containing all necessary spaces.
- `LLMTextFrame`s set `includes_inter_frame_spaces=True` at init time, per the aforementioned assumption.
2025-11-17 10:14:28 -05:00
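One way to picture what the flag in the commit above controls is the sketch below, which stitches frame text back together. It assumes that `includes_inter_frame_spaces=True` means the text already carries its own spacing, and that `False` means a consumer must insert a space between frames. `join_frame_text` is a hypothetical helper written for this example, not the aggregator's real implementation.

```python
from typing import Iterable, Tuple


def join_frame_text(frames: Iterable[Tuple[str, bool]]) -> str:
    """Concatenate (text, includes_inter_frame_spaces) pairs into one string."""
    out = ""
    for text, includes_inter_frame_spaces in frames:
        # Only add a separator when the frame does not bring its own spacing.
        if out and not includes_inter_frame_spaces:
            out += " "
        out += text
    return out
```

Because the decision is made per frame, a mix of both kinds of frames can be handled in a single pass.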
Filipi da Silva Fuchter
f27119a712 Merge pull request #3069 from pipecat-ai/filipi/fix_riva
Fixing RivaTTSService error handler.
2025-11-17 11:48:15 -03:00
ivaaan
2a51d0f1e5 add changelog 2025-11-17 15:20:06 +01:00
ivaaan
9156e21727 fix formatting 2025-11-17 14:00:03 +01:00
Filipi da Silva Fuchter
a5145be16e Merge pull request #3038 from pipecat-ai/filipi/flux_improvements
Deepgram Flux improvements
2025-11-17 09:57:43 -03:00
Filipi Fuchter
b104a59b10 Mentioning the Deepgram Flux improvements in the changelog. 2025-11-17 09:54:39 -03:00
Filipi Fuchter
04dbbabc03 Introduced a minimum confidence parameter in DeepgramFluxSTTService to avoid generating transcriptions below a defined threshold. 2025-11-17 09:54:30 -03:00
Filipi Fuchter
19cc0177b8 Refactored DeepgramFluxSTTService to automatically reconnect if sending a message fails. 2025-11-17 09:54:20 -03:00
Filipi Fuchter
77cd106795 Extracted the logic for retrying connections, and created a new send_with_retry method inside WebSocketService. 2025-11-17 09:54:08 -03:00
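The retry idea in the commit above can be sketched as follows. This is not the real `send_with_retry` signature: `send_with_retry_sketch`, its parameters, the choice of `ConnectionError`, and the exponential backoff are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable


async def send_with_retry_sketch(
    send: Callable[[str], Awaitable[None]],
    reconnect: Callable[[], Awaitable[None]],
    message: str,
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> bool:
    """Try to send; on a connection error, back off, reconnect, and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            await send(message)
            return True
        except ConnectionError:
            if attempt == max_attempts:
                return False
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
            await reconnect()
    return False
```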
ivaaan
71869a116d fix errors 2025-11-17 13:51:04 +01:00
ivaaan
2f2bde9856 add timestamps to example 2025-11-17 13:40:03 +01:00
ivaaan
7de8838deb add word-level timestamp support to Hume service 2025-11-17 13:25:12 +01:00
Filipi Fuchter
9bf88bbf14 Fixing RivaTTSService error handler. 2025-11-17 07:43:30 -03:00
Mark Backman
35ff44b799 Merge pull request #3059 from pipecat-ai/mb/remove-llm-tracing-fallback 2025-11-14 14:07:40 -05:00
Angad Singh
d1116d149e feat: Add ErrorFrame emission to TTS/STT services for pipeline error detection (#2881)
* feat: Add ErrorFrame emission to TTS/STT services for pipeline error detection

- Add ErrorFrame emission to all major TTS/STT services during initialization and runtime failures
- Services updated: Cartesia, ElevenLabs, Deepgram, AssemblyAI, Rime, Azure
- ErrorFrame objects emitted with fatal=False for graceful degradation
- Enables on_pipeline_error event handler to detect service failures programmatically
- Add comprehensive pytest test suite to verify ErrorFrame emission
- Fixes issue where services failed gracefully but didn't emit ErrorFrame objects

This allows developers to implement real-time error monitoring and alerting
using the on_pipeline_error event handler introduced in v0.0.90.

* Update STT and TTS services to use consistent error handling pattern

- Improves error handling consistency across all services

* Add changelog entry for STT/TTS error handling improvements

* Linting issues Resolved

* Azure STT ErrorFrames added with consistent patterns

* Cartesia STT and Deepgram STT; additional fixes made

* Removed Fatal Flags across services, removed duplication

* Moving the changelog entry to the correct place.

* Refactoring some classes to use yield instead of push_error directly.

* Fixing ruff format.

---------

Co-authored-by: Filipi Fuchter <filipi87@gmail.com>
2025-11-14 15:03:05 -03:00
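The "yield instead of push_error" pattern mentioned in the commit above could look something like this sketch. `SketchErrorFrame`, `SketchAudioFrame`, and `run_tts_sketch` are hypothetical stand-ins; the real services and frame classes have their own fields and signatures.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, Awaitable, Callable, Union


@dataclass
class SketchErrorFrame:
    """Stand-in for an error frame carrying a message."""

    error: str


@dataclass
class SketchAudioFrame:
    """Stand-in for a frame carrying synthesized audio."""

    audio: bytes


async def run_tts_sketch(
    text: str,
    synthesize: Callable[[str], Awaitable[bytes]],
) -> AsyncGenerator[Union[SketchAudioFrame, SketchErrorFrame], None]:
    """Yield audio on success, or an error frame instead of failing silently."""
    try:
        yield SketchAudioFrame(await synthesize(text))
    except Exception as exc:
        # The failure travels downstream as a frame so handlers can observe it.
        yield SketchErrorFrame(error=f"TTS failed: {exc}")
```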
Mark Backman
d01876ee60 Remove fallbacks in traced_llm 2025-11-14 12:13:49 -05:00
Mark Backman
74a0e8c88d Merge pull request #3050 from ai-coustics/aic-vad-analyzer
feat(ai-coustics): add ai-coustics integrated VAD
2025-11-14 08:11:15 -05:00
Corvin Jaedicke
fbbad27d37 add changelog info 2025-11-14 13:30:06 +01:00
kompfner
e83ac82bf3 Merge pull request #3042 from pipecat-ai/pk/follow-up-inter-frame-spaces
Follow-up to #3041
2025-11-13 11:03:06 -05:00
Mark Backman
d78d38ce44 Merge pull request #3039 from pipecat-ai/mb/update-google-gemini-tts
Update GeminiTTSService for streaming, other Google TTS improvements
2025-11-13 10:33:46 -05:00
Mark Backman
edbf96b3c5 Update GeminiTTSService for streaming, other Google TTS improvements 2025-11-13 10:22:34 -05:00
Paul Kompfner
8851d18f92 Tweak the LLM prompt again to try to fix the issue of LLMs sometimes omitting punctuation in their output. 2025-11-13 10:02:33 -05:00
Mark Backman
d823a3edec Merge pull request #3040 from pipecat-ai/mb/11labs-realtime-stt
Add ElevenLabsRealtimeSTTService
2025-11-13 09:53:34 -05:00
Mark Backman
0e37658f8d Add ElevenLabsRealtimeSTTService 2025-11-13 09:49:05 -05:00
Corvin Jaedicke
2fab3e2286 fix formatting 2025-11-13 14:39:26 +01:00
Corvin Jaedicke
a7b2052b38 add ai-coustics VAD 2025-11-13 14:20:35 +01:00
Mark Backman
6d0e99c3b8 Merge pull request #3044 from rimelabs/rime-hin-lanaguge-support
Add support for Hindi language in Rime TTS service
2025-11-12 21:13:01 -05:00
gokuljs
fe25465987 changelog update 2025-11-13 07:16:36 +05:30
gokuljs
498e9ca4f6 Add support for Hindi language in Rime TTS service 2025-11-13 04:33:22 +05:30
Paul Kompfner
1802f949ef Fix an issue with some examples where punctuation was missing from the LLM output, by tweaking the LLM prompt. 2025-11-12 17:12:03 -05:00
Paul Kompfner
1ad6405ebb Override includes_inter_frame_spaces in:
- `GoogleHttpTTSService`
- `OpenAITTSService`

The reason I skipped this work in an earlier PR was because these services seemed to be emitting long, punctuation-free text frames. It turns out that the issue was with the LLM prompt, though, resulting in the LLM nondeterministically excluding all punctuation. An upcoming commit will address that prompt issue.
2025-11-12 17:07:43 -05:00
kompfner
4c25555396 Merge pull request #3041 from pipecat-ai/pk/apply-includes-inter-frame-spaces-wherever-necessary
Apply `includes_inter_frame_spaces = True` in all LLM and TTS service…
2025-11-12 16:49:14 -05:00
Paul Kompfner
5222ff99de Apply includes_inter_frame_spaces = True in all LLM and TTS services that need it.
Note that for `LLMTextFrame`s, the right behavior is pretty much always `includes_inter_frame_spaces = True`. I decided *not* to go ahead and make that the default for `LLMTextFrame`s, though, simply to not introduce a subtle behavior change for creative/unexpected use-cases that were relying on text in hand-crafted `LLMTextFrame`s being handled a certain way. Ditto for `TTSTextFrame`s.

Also, fix an issue in `NeuphonicTTSService` where it wasn't pushing `TTSTextFrame`s.

Also, fix the broken `SarvamHttpTTSService` example.

Also, add a couple of missing examples.
2025-11-12 15:10:11 -05:00
Mark Backman
203a627707 Merge pull request #3028 from sam-s10s/fix/smx-tts-retry
SpeechmaticsTTS - Support for retry when 503 error to TTS API.
2025-11-12 09:26:07 -05:00
James Hush
2006a64def Fix Langfuse tracing for GoogleLLMService with universal LLMContext (#3025)
* Fix Langfuse tracing for GoogleLLMService with universal LLMContext

- Fixed issue where input appeared as null in Langfuse dashboard for GoogleLLMService
- Added fallback to use adapter's get_messages_for_logging() for universal LLMContext
- Ensures proper message format conversion for Google/Gemini services
- Handles system message conversion to system_instruction format
- Also fixes serialization of empty message lists ([] now serializes correctly)

This fix ensures Langfuse tracing works correctly for Google services using
both OpenAILLMContext/GoogleLLMContext and the universal LLMContext.

* Add unit tests for Langfuse tracing with GoogleLLMService

- Test that tracing correctly captures messages with universal LLMContext
- Test that empty message lists are properly serialized
- Test that adapter's get_messages_for_logging is used instead of context method
- All tests verify that input is correctly added to Langfuse spans

* Fix test mocking to patch opentelemetry.trace.get_tracer correctly

The tests were failing in CI because they were trying to patch
'pipecat.utils.tracing.service_decorators.trace' which doesn't exist as
an attribute. The trace module is imported from opentelemetry, so we need
to patch 'opentelemetry.trace.get_tracer' instead.

* Skip tracing tests when opentelemetry is not installed

The tracing dependencies (opentelemetry) are optional in Pipecat and not
installed in the CI environment. Added a skipif marker to skip these tests
when opentelemetry is not available, preventing CI failures while still
allowing the tests to run when tracing dependencies are installed locally.

* Install tracing dependencies in GitHub Actions CI

Instead of skipping the tracing tests, install the 'tracing' extra
(opentelemetry) in the CI environment so the tests can run properly.
Removed the skipif condition from the tests since opentelemetry will
now be available in CI.

* Use the context type to determine which messages to use, fix tool_count and tools (#3032)

---------

Co-authored-by: Mark Backman <mark@daily.co>
2025-11-12 14:58:00 +01:00
Corvin Jaedicke
3c76917c1e use async process function 2025-11-12 13:48:22 +01:00
Filipi da Silva Fuchter
eb36a1bc91 Merge pull request #3033 from pipecat-ai/filipi/deepgram_flux_urlencode_changelog
Mentioning DeepgramFluxSTTService URL encode fix in changelog.
2025-11-11 17:29:07 -03:00
Filipi Fuchter
fff8aac18c Mention DeepgramFluxSTTService URL encode fix in changelog. 2025-11-11 17:25:40 -03:00
Filipi da Silva Fuchter
ec4bd8db10 Merge pull request #3014 from julienvantyghem/fix/deepgramflux-keyterm-encoding
fix(deepgram-flux): urlencode keyterm and tag parameters
2025-11-11 17:24:17 -03:00
Filipi da Silva Fuchter
4cc298d616 Merge pull request #3029 from pipecat-ai/filipi/heygen_keep_alive
Preventing HeyGenVideoService from disconnecting.
2025-11-11 15:25:43 -03:00
Sam Sykes
8d21b54ef3 Revert to ErrorFrame. 2025-11-11 18:24:08 +00:00
Sam Sykes
217d7e9953 Fix for max attempts. 2025-11-11 18:05:06 +00:00
Sam Sykes
41cf9adef4 Updated for max retries. 2025-11-11 18:00:27 +00:00
Sam Sykes
501744d7da Update CHANGELOG. 2025-11-11 17:53:31 +00:00
Sam Sykes
60bc77c795 Update debugging messages. 2025-11-11 17:50:06 +00:00
Sam Sykes
0febfc62ec Updated to use backoff utility function. 2025-11-11 17:45:22 +00:00
Filipi Fuchter
b76b25a6e1 Mentioning the HeyGen fix in the changelog. 2025-11-11 11:58:31 -03:00
Filipi Fuchter
62caadfc7c Preventing HeyGenVideoService from disconnecting. 2025-11-11 11:37:46 -03:00
Sam Sykes
41ac43cf71 updated docs 2025-11-11 13:56:45 +00:00
Sam Sykes
adf5198423 Support for retry when 503 error to TTS API. 2025-11-11 13:49:14 +00:00
Mark Backman
54e8d29615 Merge pull request #3022 from pipecat-ai/mb/changelog-0.0.94
Prep for 0.0.94 hotfix
2025-11-10 16:52:38 -05:00
Mark Backman
ee494918a9 Prep for 0.0.94 hotfix 2025-11-10 16:50:58 -05:00
Mark Backman
aa8a50bc61 Merge pull request #3015 from pipecat-ai/mb/deprecate-krisp
Deprecate KrispFilter
2025-11-10 16:38:06 -05:00
Mark Backman
20857ac19a Merge pull request #3010 from pipecat-ai/mb/gemini-live-ar-xa
Add ar-XA language code for Gemini Live
2025-11-10 16:36:33 -05:00
Mark Backman
421a1b5389 Merge pull request #3021 from pipecat-ai/mb/add-sarvam-stt-readme
Add Sarvam STT to README list
2025-11-10 16:36:03 -05:00
Mark Backman
8dd45af5b7 Deprecate KrispFilter 2025-11-10 16:35:11 -05:00
kompfner
66c903276a Merge pull request #3020 from pipecat-ai/pk/make-explicit-adding-spaces-when-concatenating-tts-text
Make the mechanism of adding spaces when concatenating TTS (or speech…
2025-11-10 14:34:10 -05:00
Mark Backman
588dcf2ab9 Add Sarvam STT to README list 2025-11-10 14:29:54 -05:00
Paul Kompfner
913194844e Make the mechanism of adding spaces when concatenating TTS (or speech-to-speech LLM) output text explicit and deterministic, rather than heuristic-based.
This fixes a bug where spaces were sometimes missing from assistant messages in context.
2025-11-10 14:22:32 -05:00
Vanessa Pyne
c2ce143e6c Merge pull request #3017 from pipecat-ai/vp-rm-livekit-serializer
remove LivekitFrameSerializer
2025-11-10 11:56:47 -06:00
vipyne
c1c7a561ed remove LivekitFrameSerializer 2025-11-10 11:06:12 -06:00
kompfner
05311dcfbf Merge pull request #3016 from pipecat-ai/mb/revert-concat-aggregated-text
Revert "Merge pull request #3004 from pipecat-ai/mb/improve-concat-ag…
2025-11-10 10:49:38 -05:00
Mark Backman
2300941bb8 Revert "Merge pull request #3004 from pipecat-ai/mb/improve-concat-aggregated-text"
This reverts commit 5e7f59a0b0, reversing
changes made to 2ad4122b77.
2025-11-10 09:58:19 -05:00
Julien Vantyghem
c38055dbdd fix(deepgram-flux): urlencode keyterm and tag parameters 2025-11-09 19:17:19 +01:00
Thu Nguyen
35593b8574 Add cached and reasoning token metrics to OpenTelemetry spans 2025-11-09 00:38:30 +07:00
Mark Backman
0df75b0915 Add ar-XA language code for Gemini Live 2025-11-08 08:24:55 -05:00
Aleix Conchillo Flaqué
16e2d5b998 Merge pull request #3007 from pipecat-ai/aleix/pipecat-0.0.93
update CHANGELOG for 0.0.93
2025-11-07 13:25:25 -08:00
Aleix Conchillo Flaqué
4cf9e1409e update CHANGELOG for 0.0.93 2025-11-07 13:17:44 -08:00
Aleix Conchillo Flaqué
0ed430e7e2 examples(foundational): use DeepgramSTTService in 07 2025-11-07 11:34:11 -08:00
Aleix Conchillo Flaqué
342a8b121b pyproject: update simli to 0.1.25 2025-11-07 11:30:41 -08:00
Aleix Conchillo Flaqué
5729722dcd SimliVideoService: check exception initializing simli client 2025-11-07 11:30:41 -08:00
Aleix Conchillo Flaqué
38aac44a1e scripts(evals): 26c should be a camera eval 2025-11-07 11:30:41 -08:00
Aleix Conchillo Flaqué
4f1468e0fa scripts(evals): improve eval prompt 2025-11-07 10:05:46 -08:00
Aleix Conchillo Flaqué
9b1192ca9b Merge pull request #3001 from pipecat-ai/pk/openai-realtime-toolsschema-support
Added support for passing in a `ToolsSchema` in lieu of a list of pro…
2025-11-07 09:37:43 -08:00
Mark Backman
5e7f59a0b0 Merge pull request #3004 from pipecat-ai/mb/improve-concat-aggregated-text
Improve concatenate_aggregated_text string utility
2025-11-07 12:37:12 -05:00
Aleix Conchillo Flaqué
2ad4122b77 Merge pull request #3006 from pipecat-ai/aleix/vision-image-backwards-compatibility
restore vision/image backwards compatibility
2025-11-07 09:19:38 -08:00
Mark Backman
5950f734f5 Merge pull request #3002 from pipecat-ai/mb/clarify-model-openai-realtime
Clarify how to set model in OpenAIRealtimeLLMService
2025-11-07 12:05:10 -05:00
Aleix Conchillo Flaqué
8d0364b630 restore vision/image backwards compatibility 2025-11-07 08:53:58 -08:00
kompfner
bfe031604a Merge pull request #3005 from pipecat-ai/pk/add-missing-comments
Add missing explanatory comments to AWS and Anthropic that are presen…
2025-11-07 11:50:41 -05:00
kompfner
9bfde61183 Merge pull request #3003 from pipecat-ai/pk/fix-deprecation-warning-always-printed-on-set-bot-ready
Fix a deprecation warning printed every time `RTVIProcessor.set_bot_r…
2025-11-07 11:50:30 -05:00
Paul Kompfner
cb40a39a01 Add missing explanatory comments to AWS and Anthropic that are present in the other LLM services 2025-11-07 11:44:44 -05:00
Mark Backman
03001f8047 Update TranscriptProcessor unit tests 2025-11-07 11:44:04 -05:00
Paul Kompfner
10f1c314b6 Fix a deprecation warning printed every time RTVIProcessor.set_bot_ready() is called 2025-11-07 11:27:58 -05:00
Mark Backman
4d1d6465fc Support OpenAI Realtime and Gemini Live single word edge cases in concatenate_aggregated_text 2025-11-07 11:26:38 -05:00
Paul Kompfner
359d220162 Document an OpenAIRealtimeLLMService gotcha in an example. 2025-11-07 10:32:27 -05:00
Mark Backman
6feecf05f7 Merge pull request #2994 from Toprak2/patch-1
Fix incorrect docstring in FrameProcessorQueue.__init__
2025-11-07 10:21:11 -05:00
Paul Kompfner
c3306bb4f2 Support for passing in a ToolsSchema in lieu of a list of provider-specific dicts when updating OpenAIRealtimeLLMService using LLMUpdateSettingsFrame. 2025-11-07 10:18:29 -05:00
Mark Backman
07a4aae248 Clarify how to set model in OpenAIRealtimeLLMService 2025-11-07 09:58:12 -05:00
Paul Kompfner
925a6cc2ef Added support for passing in a ToolsSchema in lieu of a list of provider-specific dicts when initializing OpenAIRealtimeLLMService.
I chose to go the somewhat hacky route of adding the `ToolsSchema` support into the `events.SessionProperties` model itself—even though we should never serialize that type when creating events—because the alternative seemed to be to create a new type for `OpenAIRealtimeLLMService` initialization parameters and then we'd have to contend with backward compatibility, which seemed like a bigger headache.
2025-11-07 09:50:26 -05:00
Mark Backman
613ad74103 Merge pull request #3000 from pipecat-ai/mb/clarify-openai-realtime-model-docs 2025-11-07 06:36:30 -05:00
Muhammed Ali Toprak
2ab6b71890 Shorten docstring for clarity 2025-11-07 11:24:06 +03:00
Toprak2
c2bd8d22a0 Merge branch 'pipecat-ai:main' into patch-1 2025-11-07 11:19:08 +03:00
Mark Backman
eda12f56e6 Add clarifying documentation about OpenAI Realtime model use 2025-11-06 19:42:35 -05:00
Aleix Conchillo Flaqué
3daa1b7850 Merge pull request #2998 from pipecat-ai/aleix/transport-params-audio-out-end-silence-secs
BaseOutputTransport: send silence when EndFrame is received
2025-11-06 12:18:33 -08:00
Aleix Conchillo Flaqué
4c8c44ecc3 BaseOutputTransport: send silence when EndFrame is received 2025-11-06 12:16:05 -08:00
Aleix Conchillo Flaqué
8c34e1efba Merge pull request #2996 from pipecat-ai/aleix/broadcast-frame
FrameProcessor: add new broadcast_frame() method
2025-11-06 12:13:15 -08:00
Aleix Conchillo Flaqué
f6916428b1 FrameProcessor: add new broadcast_frame() method 2025-11-06 12:11:48 -08:00
Marcus
a14d00b806 Improved LocalSmartTurnAnalyzerV3 performance on systems with a low CPU count (#2982) 2025-11-06 14:42:05 -05:00
Mark Backman
927cf751c0 Merge pull request #2997 from pipecat-ai/mb/google-stt-409
GoogleSTTService: Add more robust handling of 409 errors
2025-11-06 14:39:51 -05:00
Mark Backman
1fb6d6bd23 GoogleSTTService: Add more robust handling of 409 errors 2025-11-06 14:35:53 -05:00
Mark Backman
94a3306679 Merge pull request #2995 from pipecat-ai/mb/fix-stt-mute-filter-stt-muting
fix: STTMuteFilter no longer sends STTMuteFrame
2025-11-06 14:35:07 -05:00
Mark Backman
16bd1fe32d Merge pull request #2984 from pipecat-ai/mb/11labs-pronunciation-dictionary
Add ElevenLabs pronunciation dictionary support
2025-11-06 14:23:49 -05:00
Mark Backman
58b552171d Add pronunciation_dictionary_locators to ElevenLabs TTS Services 2025-11-06 14:13:51 -05:00
Mark Backman
4732a442d4 Merge pull request #2992 from pipecat-ai/mb/metrics-log-observer
Add MetricsLogObserver
2025-11-06 14:04:55 -05:00
Mark Backman
accdddce95 Add MetricsLogObserver 2025-11-06 14:01:20 -05:00
Mark Backman
daf9da823c Merge pull request #2993 from pipecat-ai/mb/fix-gemini-token-counting
fix: correct GoogleLLMService token counting
2025-11-06 13:47:51 -05:00
Mark Backman
f6b6aa8766 fix: STTMuteFilter no longer sends STTMuteFrame 2025-11-06 11:53:32 -05:00
Toprak2
935eb58951 Update docstring for FrameProcessorQueue
Clarify the docstring for FrameProcessorQueue initialization.
2025-11-06 19:18:15 +03:00
Mark Backman
9f2ddcc5f4 Merge pull request #2927 from pipecat-ai/marcus/2025-10-28_sample_rtvi_fix
Add RTVIProcessor to foundational example 38b
2025-11-06 10:19:10 -05:00
Mark Backman
961e28517e Remove arg from RTVIProcessor 2025-11-06 10:16:31 -05:00
Mark Backman
34d6f3fa00 fix: correct GoogleLLMService token counting 2025-11-06 10:01:37 -05:00
Filipi da Silva Fuchter
616abfd96c Merge pull request #2987 from pipecat-ai/filipi/fix_start_endpoint
Fixing the runner start endpoint when enableDefaultIceServers is enabled.
2025-11-06 09:32:01 -03:00
Mark Backman
d7774ac599 Merge pull request #2991 from pipecat-ai/mb/fix-deepgram-is-connected 2025-11-06 06:35:51 -05:00
Mark Backman
c8c13ecee2 fix: DeepgramSTTService await is_connected() 2025-11-05 21:42:15 -05:00
Vanessa Pyne
314acc104e Merge pull request #2990 from pipecat-ai/vp-fix-riva-default-voice
fix default riva tts voice_id
2025-11-05 18:40:41 -06:00
vipyne
1dfa59257d fix default riva tts voice_id 2025-11-05 18:30:05 -06:00
Mark Backman
376dcc34f7 Merge pull request #2986 from pipecat-ai/mb/docs-0.0.93
Fix docstrings for 0.0.93 release, fix classmethod placement in Reque…
2025-11-05 15:49:09 -05:00
Filipi Fuchter
5ee8c56899 Fixing the runner start endpoint when enableDefaultIceServers is enabled. 2025-11-05 17:36:24 -03:00
kompfner
4397deddc7 Merge pull request #2970 from pipecat-ai/pk/tool-registration-improvements
Assorted tool registration improvements
2025-11-05 15:31:15 -05:00
Paul Kompfner
13d6078ea0 Minor tweak to an example for clarity. 2025-11-05 15:30:01 -05:00
Paul Kompfner
61aec08794 CHANGELOG item ordering tweak 2025-11-05 15:29:58 -05:00
Paul Kompfner
0f69d4aea3 Fixed an issue where GeminiLiveLLMService wasn't consistent in what it would do if it received an LLMContextFrame (triggered by an LLMRunFrame, say) and there were no user messages in the initial context:
- If the context contained a system message, that message would be converted to a user message and the LLM would respond
- If the system message was provided as a constructor argument, though, no user messages would be sent to the LLM, and the LLM would therefore not respond

Not adding this fix to the CHANGELOG since `GeminiLiveLLMService`'s ability to properly handle context-provided tools and system instruction hasn't been published yet.
2025-11-05 15:29:04 -05:00
Paul Kompfner
84ba628dfb Fix a bug in GeminiLiveLLMService where if only *one* of tools or system instruction was provided in the context, the other wouldn't fall back to using the value provided in the constructor.
Not adding this fix to the CHANGELOG since `GeminiLiveLLMService`'s ability to properly handle context-provided tools and system instruction hasn't been published yet.
2025-11-05 15:29:04 -05:00
Paul Kompfner
9ce33f23b9 Add an example demonstrating MCP usage with a speech-to-speech service (GeminiLiveLLMService) using the pattern of passing in tools in the constructor 2025-11-05 15:29:04 -05:00
Paul Kompfner
75245e1daa Fix a bug in GeminiLiveLLMService where in some circumstances it wouldn't respond after a tool call 2025-11-05 15:29:04 -05:00
Paul Kompfner
24365aeefe CHANGELOG wording fix 2025-11-05 15:29:04 -05:00
Paul Kompfner
29ef0f419f Add typing formalizing MCPClient support for registering tools on an LLMSwitcher in addition to an LLMService. 2025-11-05 15:29:01 -05:00
Paul Kompfner
a9d78bd956 Make it possible to get a ToolsSchema out of an MCPClient without passing in an LLM service.
This allows folks to use `MCPClient` alongside the pattern of passing in tools at LLM init time, a pattern supported by speech-to-speech services such as `GeminiLiveLLMService`.
2025-11-05 15:28:23 -05:00
Paul Kompfner
e6f881bb08 Remove the "needs alternate schema" mechanism in MCPClient, moving the necessary schema massaging into GeminiLLMAdapter instead.
This does a couple of things:
- Makes the `MCPClient` LLM agnostic, setting us up for some upcoming improvements (like making it possible to use with `LLMSwitcher`)
- Makes `GeminiLLMAdapter` more robust, as the schema massaging that was previously only done in `MCPClient` is useful for all tools, not just for MCP-provided ones
2025-11-05 15:28:23 -05:00
Paul Kompfner
bee4165ba4 Add LLMSwitcher.register_direct_function() 2025-11-05 15:28:19 -05:00
Mark Backman
e2f6ce1078 Fix docstrings for 0.0.93 release, fix classmethod placement in RequestHandler 2025-11-05 15:27:16 -05:00
Paul Kompfner
0184493711 Update the service switcher example to illustrate registering tools on all LLMs in a switcher 2025-11-05 15:27:00 -05:00
Aleix Conchillo Flaqué
eb3c4c59fc Merge pull request #2985 from pipecat-ai/revert-2976-aleix/interruption-task-frame-finished-event
Revert "fix interruption task frame context ordering"
2025-11-05 12:25:45 -08:00
Aleix Conchillo Flaqué
d844829538 Revert "fix interruption task frame context ordering" 2025-11-05 12:14:03 -08:00
Mark Backman
11b101e8a6 Merge pull request #2974 from pipecat-ai/mb/language-mapping-improvements
Improve language checking in STT and TTS services
2025-11-05 14:59:41 -05:00
Mark Backman
3db5ab9f23 Merge pull request #2983 from pipecat-ai/mb/bump-fastapi-0.122.0
Bumped the fastapi dependency to <0.122.0
2025-11-05 13:24:23 -05:00
Mark Backman
9a96e4060c Bumped the fastapi dependency to <0.122.0 2025-11-05 13:13:47 -05:00
Aleix Conchillo Flaqué
d826279946 Merge pull request #2976 from pipecat-ai/aleix/interruption-task-frame-finished-event
fix interruption task frame context ordering
2025-11-05 09:53:26 -08:00
Aleix Conchillo Flaqué
e4212fb3c0 tests: add interruption strategies context ordering tests 2025-11-05 09:38:18 -08:00
Aleix Conchillo Flaqué
234aae3091 FrameProcessor: use finished_event for push_interruption_task_frame_and_wait 2025-11-05 09:38:17 -08:00
Aleix Conchillo Flaqué
c33b81bb92 PipelineTask: set finished_event when InterruptionFrame is received 2025-11-05 09:37:55 -08:00
Aleix Conchillo Flaqué
a1c07039ee frames: added finished_event to InterruptionFrame/InterruptionTaskFrame 2025-11-05 09:37:55 -08:00
Mark Backman
33be73692f Merge pull request #2979 from thsunkid/feature/whisper-stt-probability-metrics
Feat: Access prob metrics for Whisper STT services using include_prob_metrics
2025-11-05 12:33:24 -05:00
Mark Backman
f6d7b6ae5f Fix SpeechmaticsSTTService: use language checking for language and output_locale 2025-11-05 12:26:52 -05:00
Mark Backman
2ee54c985f Improve language checking in STT and TTS services 2025-11-05 12:26:52 -05:00
Thu Nguyen
76c336644a Merge branch 'main' into feature/whisper-stt-probability-metrics 2025-11-06 00:24:34 +07:00
Thu Nguyen
dd8711dee1 Added changelog 2025-11-06 00:23:42 +07:00
Thu Nguyen
c26c27fe21 Update util with new docs and extract_deepgram_probability 2025-11-06 00:23:20 +07:00
Mark Backman
159dbd078d Merge pull request #2980 from pipecat-ai/mb/gemini-vertex-update
Refactor GoogleVertexLLMService to use GoogleLLMService as a base class
2025-11-05 11:35:50 -05:00
Mark Backman
c18ff999a5 Update GoogleVertexLLMService default model to gemini-2.5-flash 2025-11-05 11:28:41 -05:00
Mark Backman
80d127aaa4 Refactor GoogleVertexLLMService to use GoogleLLMService as a base class 2025-11-05 09:33:02 -05:00
Mark Backman
bbc7d3e2fb Merge pull request #2977 from pipecat-ai/mb/request-handler-smallwebrtc
Fix: support request data in SmallWebRTC
2025-11-05 08:50:31 -05:00
Thu Nguyen
3486d63ef6 Add docs 2025-11-05 13:30:49 +07:00
Thu Nguyen
842c4a3485 Update base_stt 2025-11-05 13:26:59 +07:00
Thu Nguyen
0b779a880b Feat: allow accessing prob metrics for Whisper STT services with include_prob_metrics param 2025-11-05 13:24:49 +07:00
Mark Backman
01f3421052 Fix: support request data in SmallWebRTC 2025-11-04 17:14:29 -05:00
Aleix Conchillo Flaqué
c20aa78648 Merge pull request #2969 from pipecat-ai/aleix/pipecat-observer-files
PipelineTask: load observers from PIPECAT_OBSERVER_FILES
2025-11-04 12:34:37 -08:00
Aleix Conchillo Flaqué
38f27ad991 PipelineTask: load observers from PIPECAT_OBSERVER_FILES 2025-11-04 12:10:53 -08:00
Mark Backman
0c38585034 Merge pull request #2973 from pipecat-ai/mb/cartesia-sonic-3-languages
Add sonic-3 languages to Cartesia TTS services
2025-11-04 14:43:06 -05:00
Mark Backman
8a09bbbf0e Merge pull request #2972 from akash-dutta-dev/hotfix/addCustomParamForExotel
Add customer parameter in Call Data for Exotel
2025-11-04 14:29:58 -05:00
Vanessa Pyne
fb737ff671 Merge pull request #2967 from pipecat-ai/vp-39-bork-83
update example 39-mcp-stdio.py to use different mcp server
2025-11-04 09:02:29 -06:00
vipyne
b7a4d7371c wrap tools = await mcp.register_tools(llm) in try in examples 2025-11-04 09:01:12 -06:00
vipyne
ef88d6a2ea update example 39-mcp-stdio.py to use different mcp server
https://www.loom.com/share/a9f0a270261d4c6cb054ab2b4dcd6084

SO to Rijksmuseum MCP
https://github.com/r-huijts/rijksmuseum-mcp
2025-11-04 09:01:12 -06:00
kompfner
5c1bd8cda2 Merge pull request #2961 from pipecat-ai/pk/gemini-live-fix-session-resumption
Fix Gemini Live session resumption. The problem was that we weren't p…
2025-11-04 09:19:17 -05:00
Paul Kompfner
a82158045a Fix Gemini Live session resumption. The problem was that we weren't properly ignoring send errors during reconnection. 2025-11-04 09:18:40 -05:00
Mark Backman
b1533ddfc4 Add sonic-3 languages to Cartesia TTS services 2025-11-04 07:57:04 -05:00
Mark Backman
0abc699f24 Merge pull request #2964 from pipecat-ai/mb/14j-nim-updates
Fix 14j foundational example
2025-11-04 07:24:53 -05:00
Akash Dutta
09018071e8 Add customer parameter in Call Data for Exotel 2025-11-04 16:57:28 +05:30
Mark Backman
1c53a5fd01 Fix 14j foundational example 2025-11-03 14:57:44 -05:00
kompfner
05d4753d3e Merge pull request #2956 from pipecat-ai/pk/gemini-honor-context-provided-instructions-and-tools
`GeminiLiveLLMService` supports context-provided system instruction a…
2025-11-03 10:38:26 -05:00
Paul Kompfner
87131850bc GeminiLiveLLMService supports context-provided system instruction and tools 2025-11-03 10:30:46 -05:00
Aleix Conchillo Flaqué
af83f45a49 Merge pull request #2959 from pipecat-ai/aleix/cancel-frame-reason
CancelFrame: add reason field to indicate why pipeline is being cancelled
2025-11-02 11:06:58 -08:00
Aleix Conchillo Flaqué
62e45f466a EndFrame: add reason field to indicate why pipeline is being ended 2025-11-02 00:45:27 -07:00
Aleix Conchillo Flaqué
e85e93b9b1 CancelFrame: add reason field to indicate why pipeline is being cancelled 2025-11-02 00:44:47 -07:00
Mark Backman
074d3ff162 Merge pull request #2821 from shreyas-sarvam/sarvam/stt
Sarvam STT/STTT WS implementation
2025-10-31 13:47:27 -04:00
shreyas-sarvam
d680ec2e69 Merge branch 'main' into sarvam/stt 2025-10-31 23:09:47 +05:30
shreyas-sarvam
d905b21f72 fix: Pass input_audio_codec as an __init__ parameter 2025-10-31 23:07:48 +05:30
shreyas-sarvam
6c5d84ca4c fix: Fixes for sample_rate being passed by PipelineParams 2025-10-31 23:03:25 +05:30
Aleix Conchillo Flaqué
334167e3d7 Merge pull request #2953 from pipecat-ai/aleix/pipecat-0.0.92
update CHANGELOG for 0.0.92. 🎃 "The Haunted Edition" 👻
2025-10-31 09:47:25 -07:00
Aleix Conchillo Flaqué
e3531a5f25 update CHANGELOG for 0.0.92. 🎃 "The Haunted Edition" 👻 2025-10-31 09:17:03 -07:00
Mark Backman
343e97666a Merge pull request #2954 from pipecat-ai/mb/runner-meeting-token-properties
Add support for token properties in Daily util and development runner
2025-10-31 12:12:14 -04:00
Mark Backman
653e84321b Add support for token properties in Daily util and development runner 2025-10-31 12:08:53 -04:00
Mark Backman
3585f724c4 Merge pull request #2952 from pipecat-ai/mb/add-daily-room-properties-to-runner
Add support for dailyRoomProperties when calling /start using the dev…
2025-10-31 12:04:42 -04:00
Mark Backman
5fe597d355 Add support for dailyRoomProperties when calling /start using the development runner 2025-10-31 12:01:03 -04:00
Aleix Conchillo Flaqué
67ab3773f6 Merge pull request #2949 from pipecat-ai/aleix/idle-timeout-observer
PipelineTask: add IdleFrameObserver to detect idle pipelines
2025-10-31 08:51:09 -07:00
Mark Backman
c6e12b9358 Merge pull request #2943 from pipecat-ai/mb/deepgram-http
Add DeepgramHttpTTSService
2025-10-31 11:51:06 -04:00
Aleix Conchillo Flaqué
0f5030bafa tests: add unit test to check for idle timeout on swallowed frames 2025-10-31 08:45:56 -07:00
Aleix Conchillo Flaqué
ed93e29850 PipelineTask: add IdleFrameObserver to detect idle pipelines 2025-10-31 08:45:56 -07:00
Mark Backman
7eb880c5e8 Add DeepgramHttpTTSService 2025-10-31 11:39:32 -04:00
Aleix Conchillo Flaqué
4fa0de6660 Merge pull request #2947 from pipecat-ai/aleix/rename-add-to-context
UserImageRawFrame: rename add_to_context to append_to_context
2025-10-31 08:29:49 -07:00
kompfner
396c1bcc13 Merge pull request #2946 from pipecat-ai/pk/deprecate-expect-stripped-words
Deprecate `expect_stripped_words` option from `LLMAssistantAggregatorParams`…
2025-10-31 09:57:20 -04:00
shreyas-sarvam
57f6ae9e50 Merge branch 'main' into sarvam/stt 2025-10-31 17:36:52 +05:30
shreyas-sarvam
2d03e51109 fix: Remove unused imports, use sample_rate from base class 2025-10-31 17:31:59 +05:30
Mark Backman
1e7143e5f3 Merge pull request #2942 from pipecat-ai/mb/speechmatics-tts-changelog
Add SpeechmaticsTTSService, Soniox changes to changelog
2025-10-31 07:43:58 -04:00
Mark Backman
f820c20fa2 Add SpeechmaticsTTSService and SonioxSTTService changes to changelog 2025-10-31 07:41:17 -04:00
Mark Backman
83f395ff8f Merge pull request #2940 from thsunkid/feature/google-tts-chirp-speaking-rate
Add dynamic speaking_rate control for Google TTS Chirp voices
2025-10-31 07:39:05 -04:00
shreyas-sarvam
09a7e08cbf Merge branch 'main' into sarvam/stt 2025-10-31 15:21:09 +05:30
shreyas-sarvam
6f172bba8f feat: Make input parameters accessible to users 2025-10-31 15:17:06 +05:30
shreyas-sarvam
1433df4de2 fix: Fix language param and include suggested way of handling STT response 2025-10-31 13:23:08 +05:30
Thu Nguyen
6ade5617fb addressed comments 2025-10-31 09:53:47 +07:00
Aleix Conchillo Flaqué
685d440206 UserImageRawFrame: rename add_to_context to append_to_context 2025-10-30 15:18:27 -07:00
Paul Kompfner
ac5734d0ed Deprecate expect_stripped_words option from LLMAssistantAggregatorParams, when used with the newer LLMAssistantAggregator, which now handles word spacing automatically.
This commit does not change how it works in the older `LLMAssistantContextAggregator`.
2025-10-30 17:22:47 -04:00
Aleix Conchillo Flaqué
5e00133e64 Merge pull request #2935 from pipecat-ai/aleix/improve-image-and-vision-support
improve image and vision support
2025-10-30 14:09:01 -07:00
Aleix Conchillo Flaqué
42f0490414 examples(foundational): 14-* show how to tell the LLM we are capturing an image 2025-10-30 14:02:17 -07:00
Aleix Conchillo Flaqué
19f046a338 examples(foundational): add 12d-describe-image-moondream 2025-10-30 14:02:17 -07:00
Aleix Conchillo Flaqué
ec95618b94 don't tie UserImageRawFrame with function calls 2025-10-30 14:02:17 -07:00
Aleix Conchillo Flaqué
74fb6e7676 scripts(evals): improve eval prompting 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
8fa6cbac51 examples(foundational): added 14d docstrings 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
a997655eac scripts(evals): simplify eval configuration and allow RunnerArgs body 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
3b3a215155 examples(foundational): re-add 12-* but load image from file 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
e458d3edfe scripts(evals): update 12-* for 14-*-video 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
d7d409df60 examples(foundational): move 12-* to 14-*-video 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
5174b18176 LLMAssistantAggregator: don't mark function calls as completed when receiving user image
Before, when requesting a user image from a function call we had to wait for a
random time before we could indicate the function call was done. This was to
give the aggregator time to process the image before marking the function
call as completed.

To avoid this, we now wait for the requested image to be received by the LLM
assistant aggregator (using an asyncio event). Then, we can successfully mark
the function call as completed.
2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
9c5690d670 LLMContext: added support for image messages with URLs 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
e0933e20d2 deprecated UserResponseAggregator 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
ce13155d26 vision(moondream): process VisionImageRawFrame 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
817a485d94 frames: added VisionImageRawFrame 2025-10-30 13:08:15 -07:00
Aleix Conchillo Flaqué
b094418d1e LLMContext: add create_image_message and create_audio_message 2025-10-30 13:08:13 -07:00
Filipi da Silva Fuchter
08a1e09020 Merge pull request #2944 from pipecat-ai/filipi/flux_handlers
New event handlers for the DeepgramFluxSTTService.
2025-10-30 16:40:41 -03:00
Filipi Fuchter
52b33e5106 New event handlers for the DeepgramFluxSTTService. 2025-10-30 16:09:07 -03:00
Mark Backman
5db0871a20 Merge pull request #2873 from matejmarinko-soniox/main
Update model params for Soniox STT
2025-10-30 12:50:30 -04:00
Mark Backman
222c362fa4 Merge pull request #2937 from aaronng91/speechmatics-tts
Add Speechmatics TTS
2025-10-30 12:30:27 -04:00
Aaron Ng
9d509bb409 address changes 2025-10-30 16:25:10 +00:00
shreyas-sarvam
8d0e7e5e16 chore: Add changelog entry, update foundational examples 2025-10-30 19:22:14 +05:30
shreyas-sarvam
e7b8da7a83 feat: Refactor code to include language parameter, model_name and use _handle_transcription method 2025-10-30 19:01:04 +05:30
shreyas-sarvam
35c48a45cf fix: Ruff format 2025-10-30 18:51:18 +05:30
shreyas-sarvam
14a365aa16 fix: Use message handler to handle responses 2025-10-30 17:54:32 +05:30
shreyas-sarvam
779fc0419d Merge branch 'main' into sarvam/stt 2025-10-30 15:50:53 +05:30
Thu Nguyen
057e0c3973 Lint 2025-10-30 17:12:36 +07:00
Thu Nguyen
8a6abdd44b feat: Add dynamic speaking_rate control for Google TTS Chirp voices 2025-10-30 17:09:41 +07:00
Mark Backman
7872fa2e88 Merge pull request #2934 from roshie548/add-cartesia-generation-config
feat: add generation_config support for Cartesia Sonic-3
2025-10-29 23:10:48 -04:00
Roshan
e86c546a1a Merge branch 'main' into add-cartesia-generation-config 2025-10-29 18:31:09 -07:00
Roshan
abf34bcccf address pr comments 2025-10-29 18:29:51 -07:00
Aleix Conchillo Flaqué
56eb633390 Merge pull request #2911 from pipecat-ai/aleix/daily-transport-improve-error-handling
DailyTransport: update start_dialout/start_recording return values
2025-10-29 16:28:10 -07:00
Aleix Conchillo Flaqué
6299b9db87 DailyTransport: trigger "on_error" if transcription fails to start/stop 2025-10-29 16:25:13 -07:00
Aleix Conchillo Flaqué
bcffa590a3 DailyTransport: update start_dialout/start_recording return values 2025-10-29 16:25:13 -07:00
kompfner
8b739aa444 Merge pull request #2889 from pipecat-ai/pk/openai-realtime-universal-llmcontext-2
Support new `LLMContext` pattern with `OpenAIRealtimeLLMService`
2025-10-29 16:54:37 -04:00
Paul Kompfner
8f15980c67 Get rid of unnecessary new task in example file 2025-10-29 16:23:50 -04:00
Paul Kompfner
89e9acf0e1 CHANGELOG and code comment tweaks 2025-10-29 16:21:04 -04:00
Paul Kompfner
ddac24e6c9 Fix a missing space in a warning message 2025-10-29 16:17:05 -04:00
Paul Kompfner
d0f52feba3 OpenAI Realtime needs the assistant context aggregator to have expect_stripped_words=False 2025-10-29 16:15:16 -04:00
Paul Kompfner
8894db4290 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add warning about no longer pushing `TTSTextFrame`s.
2025-10-29 15:45:06 -04:00
Paul Kompfner
1f96cdf970 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Make `LLMUserAggregator` push the `LLMSetToolsFrame`s, in case a speech-to-speech service that needs to handle the frame itself—like `OpenAIRealtimeLLMService`—is downstream. As far as I can tell, pushing `LLMSetToolsFrame` should otherwise have no unwanted side effects.
2025-10-29 15:43:51 -04:00
Paul Kompfner
0282033208 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add `LLMContext.get_messages_for_persistent_storage()` for compatibility with `OpenAILLMContext`, to avoid tripping up users who we're unknowingly migrating to using `LLMContext`.
2025-10-29 15:43:51 -04:00
Paul Kompfner
917ea27352 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update `AzureRealtimeLLMService` example (19a) to use new `LLMContext` pattern.
2025-10-29 15:43:51 -04:00
Paul Kompfner
8c03df1463 Update some docstring arg descriptions to be a bit more current or accurate 2025-10-29 15:43:51 -04:00
Paul Kompfner
15aa76efba Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Maintain backward compatibility with functions specified in dict format.
2025-10-29 15:43:51 -04:00
Paul Kompfner
8ac421f8fd Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Remove unused imports.
2025-10-29 15:43:51 -04:00
Paul Kompfner
75b3ea9c96 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Fix tracing.
2025-10-29 15:43:51 -04:00
Paul Kompfner
95be1510ac Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Improve `OpenAIRealtimeLLMAdapter.get_messages_for_logging()`.
2025-10-29 15:43:51 -04:00
Paul Kompfner
df19011080 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Improve warning about transcription frame direction change.
2025-10-29 15:43:51 -04:00
Paul Kompfner
e42cf78e79 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update deprecation versions.
2025-10-29 15:43:51 -04:00
Paul Kompfner
0495de52b6 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Log warning about transcription frame direction change.
2025-10-29 15:43:51 -04:00
Paul Kompfner
9bc02afd0d Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
CHANGELOG tweak.
2025-10-29 15:43:51 -04:00
Paul Kompfner
6140fdb2c9 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
In anticipation of `messages` property being added to `LLMContext` (in another PR), remove warnings about the need to use `get_messages()` instead.
2025-10-29 15:43:51 -04:00
Paul Kompfner
b6a1886dae Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd). 2025-10-29 15:43:51 -04:00
Paul Kompfner
42d0a097c5 Tweaks to 20b example 2025-10-29 15:43:51 -04:00
Paul Kompfner
3761804146 Make OpenAIRealtimeLLMService's websocket send method more resilient. Previously, it was possible for a websocket send attempt to occur during a disconnect. 2025-10-29 15:43:51 -04:00
Paul Kompfner
46e97c57c2 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update 20b example to use new `LLMContext` pattern.
2025-10-29 15:43:51 -04:00
Paul Kompfner
19770b76b4 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Add back file that was removed, when it should've just been deprecated.

Also, fix version numbers in deprecation messages to match the next expected release.
2025-10-29 15:43:51 -04:00
Paul Kompfner
b34461bf93 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd). 2025-10-29 15:43:47 -04:00
Paul Kompfner
bab0aaf585 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update `create_context_aggregator()` (which we're keeping around for backward compatibility) to create a `LLMContextAggregatorPair` rather than OpenAI-Realtime-specific aggregators.
2025-10-29 15:36:58 -04:00
Paul Kompfner
61944d22ef Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Implement sending tool call results to the OpenAI server based on reading context updates. This lets us use the normal assistant context aggregator and not a special OpenAI Realtime subclass that pushes up a special frame for function call results.
2025-10-29 15:36:58 -04:00
Paul Kompfner
47756319be Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Receiving a new context (via a context frame) no longer serves as a signal to reset the conversation. That’s because we’re now receiving new contexts from the user aggregator every time new messages are added, and from the assistant aggregator when function call results come in. The code pattern we're heading towards, of “diffing” each new context with the previous one, sets us up for doing more sophisticated things in the future, like sending specific messages to OpenAI to edit its internally-tracked context.

Also, remove code that was directly modifying context.
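
The "diffing" pattern mentioned above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the service's actual implementation: `diff_context` is a hypothetical helper, and messages are assumed to be plain dicts with `role` and `content` keys.

```python
from typing import Optional


def diff_context(previous: list[dict], current: list[dict]) -> Optional[list[dict]]:
    """Return the messages appended to `current` since `previous`.

    Returns None when `current` is not a pure extension of `previous`
    (for example, earlier messages were edited or removed), which a caller
    could treat as a signal that a more involved sync is needed.
    """
    if len(current) < len(previous):
        return None
    if current[: len(previous)] != previous:
        return None
    return current[len(previous):]


# Hypothetical usage: only the newly added assistant message is reported.
previous = [{"role": "user", "content": "Hi"}]
current = previous + [{"role": "assistant", "content": "Hello!"}]
appended = diff_context(previous, current)
```

Returning `None` for anything other than a pure append keeps the simple case cheap while leaving room to handle edits separately later.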
2025-10-29 15:36:58 -04:00
Paul Kompfner
5fa56df014 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Update 19b example with new pattern.
2025-10-29 15:36:58 -04:00
Paul Kompfner
8a151235c3 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Deprecate `send_transcription_frames`—transcription frames are now always sent.
2025-10-29 15:36:57 -04:00
Paul Kompfner
ec42f8c24e Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Push `TranscriptionFrame`s upstream, to be handled by the user context aggregator. This will require at least a couple of other changes:
- Updating examples to put transcript processors upstream from `OpenAIRealtimeLLMService`
- Maybe figuring out a way to preserve backward compatibility with existing pipelines that put transcript processors downstream from `OpenAIRealtimeLLMService`
- Updating `OpenAIRealtimeLLMService` to ignore new received context frames, since the upstream user context aggregator will generate those after each newly-added user message; hopefully nobody was reliant on the old behavior of resetting the conversation upon receiving a new context!
2025-10-29 15:36:57 -04:00
Paul Kompfner
29fd17b9ff Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (cont'd).
Avoid pushing `LLMTextFrame` when `OpenAIRealtimeLLMService` is configured to output audio. This avoids duplicate text in assistant messages in context. Conceptually, a speech-to-speech service encapsulates TTS behavior; in a "traditional" pipeline, `LLMTextFrame`s are swallowed by the TTS service, so they should similarly not be pushed by a speech-to-speech service. Only `TTSTextFrame`s should be pushed.
2025-10-29 15:36:57 -04:00
Paul Kompfner
3ea1e357f2 Update OpenAIRealtimeLLMService to work with LLMContext and LLMContextAggregatorPair (initial part of work) 2025-10-29 15:36:57 -04:00
kompfner
351ef617ae Merge pull request #2932 from pipecat-ai/pk/gemini-live-universal-llmcontext
Update `GeminiLLMService` to work with `LLMContext` and `LLMContextAg…
2025-10-29 15:35:13 -04:00
Paul Kompfner
9dafb715c4 Update some deprecation versions 2025-10-29 15:30:43 -04:00
Paul Kompfner
82d494d3d4 Fix a bug in GeminiLiveLLMService related to ending gracefully—i.e. waiting for the bot to stop responding before ending the pipeline—when the service is configured with the TEXT modality 2025-10-29 14:34:02 -04:00
Mark Backman
e893aaa620 Merge pull request #2931 from ivaaan/hume-bugfix
Hume: use Octave v1 if description provided
2025-10-29 13:40:21 -04:00
Paul Kompfner
65c17a698e Whoops - fix a bug in GeminiLiveLLMService where we weren't checking if a tool call result was already handled before reporting it to the LLM 2025-10-29 12:44:00 -04:00
Paul Kompfner
615aae5b95 Fix GeminiLiveLLMService's sending of LLMFullResponseStartFrame and LLMFullResponseEndFrame so that they properly bookend responses.
Properly bookended responses now work with:
- AUDIO modality (validated with 26b example)
- TEXT modality (validated with 26d example)
- AUDIO modality with Vertex AI (validated with 26h example)

It doesn't seem that TEXT modality is supported with Vertex AI, hence the missing "quadrant" of validation.
2025-10-29 12:33:37 -04:00
Aaron Ng
b0acbeffb9 add sm-app param 2025-10-29 16:33:18 +00:00
Ivan A
2f1061f300 Merge branch 'main' into hume-bugfix 2025-10-29 17:06:50 +01:00
ivaaan
9307079af2 upd changelog 2025-10-29 17:05:41 +01:00
Mark Backman
efa64642a4 Merge pull request #2930 from pipecat-ai/mb/simli-constructor-update
Update Simli to align with Pipecat constructor norms
2025-10-29 11:50:11 -04:00
Mark Backman
ede6c32149 Update Simli to align with Pipecat constructor norms 2025-10-29 11:47:23 -04:00
Aaron Ng
4050e8b7dc add speechmatics tts 2025-10-29 14:53:20 +00:00
Roshan
b0f5fc02c4 refactor: use Pydantic BaseModel for GenerationConfig and simplify model_dump()
- Change GenerationConfig from dataclass to Pydantic BaseModel for consistency
- Simplify _build_msg() to use model_dump(exclude_none=True) instead of manual field extraction
- Simplify HTTP run_tts() to use model_dump(exclude_none=True) instead of manual field extraction

This addresses feedback from code review and reduces code duplication.
2025-10-28 18:41:58 -07:00
Aleix Conchillo Flaqué
493d6bf91e Merge pull request #2936 from pipecat-ai/aleix/daily-python-0.21.0
pyproject: update daily-python to 0.21.0
2025-10-28 18:25:25 -07:00
Aleix Conchillo Flaqué
aaebcae2e8 pyproject: update daily-python to 0.21.0 2025-10-28 17:23:37 -07:00
Roshan
408264a0fd docs: update CHANGELOG.md for generation_config feature 2025-10-28 15:16:49 -07:00
Roshan
df8aa3e4b0 feat: add generation_config support for Cartesia Sonic-3
Add GenerationConfig dataclass with volume, speed, and emotion parameters
for Cartesia Sonic-3 TTS models. This enables fine-grained control over
speech generation including volume (0.5-2.0), speed (0.6-1.5), and
emotion (60+ options).

Changes:
- Add GenerationConfig dataclass with proper Google-style docstrings
- Update CartesiaTTSService.InputParams to include generation_config
- Update CartesiaHttpTTSService.InputParams to include generation_config
- Modify _build_msg() to include generation_config in WebSocket messages
- Modify run_tts() to include generation_config in HTTP requests
- Maintain backward compatibility with existing speed and emotion parameters

The legacy speed (literal strings) and emotion (list) parameters remain
available for non-Sonic-3 models.
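
A minimal sketch of the idea, assuming a standalone dataclass rather than the real Pipecat or Cartesia classes: the field names and numeric ranges come from the description above, while the validation logic and the `to_payload` helper are hypothetical and only illustrate how unset fields might be left out of a request.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerationConfig:
    """Illustrative stand-in; field names and ranges follow the commit text."""

    volume: Optional[float] = None  # documented range: 0.5 to 2.0
    speed: Optional[float] = None  # documented range: 0.6 to 1.5
    emotion: Optional[str] = None

    def __post_init__(self) -> None:
        # Range validation is an assumption of this sketch.
        if self.volume is not None and not 0.5 <= self.volume <= 2.0:
            raise ValueError("volume must be between 0.5 and 2.0")
        if self.speed is not None and not 0.6 <= self.speed <= 1.5:
            raise ValueError("speed must be between 0.6 and 1.5")


def to_payload(config: GenerationConfig) -> dict:
    """Drop unset fields so only explicitly chosen values are sent."""
    return {key: value for key, value in asdict(config).items() if value is not None}


# Hypothetical usage: emotion is unset, so it is omitted from the payload.
payload = to_payload(GenerationConfig(volume=1.2, speed=0.9))
```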
2025-10-28 15:10:46 -07:00
Mark Backman
4d82a1260b Merge pull request #2933 from pipecat-ai/mb/remove-aiohttp-session-sarvam
Remove aiohttp_session arg from SarvamTTSService
2025-10-28 16:54:56 -04:00
Paul Kompfner
f974c66e12 Update GeminiLLMService to work with LLMContext and LLMContextAggregatorPair 2025-10-28 15:46:28 -04:00
Mark Backman
533372ed37 Remove aiohttp_session arg from SarvamTTSService 2025-10-28 15:39:14 -04:00
ivaaan
a9118eb2cd use Octave 1 if description provided 2025-10-28 20:36:34 +01:00
Aleix Conchillo Flaqué
84ed2468e5 Merge pull request #2924 from pipecat-ai/aleix/daily-transport-remove-join-timeout
DailyTransport: don't timeout prematurely on join
2025-10-28 10:43:28 -07:00
Aleix Conchillo Flaqué
d82d855c20 DailyTransport: don't timeout prematurely on leave 2025-10-28 10:41:19 -07:00
Mark Backman
412ff2a4a1 Merge pull request #2929 from pipecat-ai/mb/cartesia-sonic-3
Update Cartesia's default model to sonic-3
2025-10-28 13:07:28 -04:00
Mark Backman
82ccc160fb Merge pull request #2923 from pipecat-ai/mb/runner-no-proxy-required
Remove development runner requirement for proxy
2025-10-28 11:59:38 -04:00
Mark Backman
9ef60bd468 Update Cartesia's default model to sonic-3 2025-10-28 11:49:54 -04:00
marcus-daily
06e86cc107 Add RTVIProcessor to foundational example 38b 2025-10-28 12:14:23 +00:00
Aleix Conchillo Flaqué
f3c4bf08dd DailyTransport: don't timeout prematurely on join 2025-10-27 17:52:19 -07:00
Mark Backman
f2cfbee3c3 Remove development runner requirement for proxy 2025-10-27 16:18:31 -04:00
Vanessa Pyne
8b063116ab Merge pull request #2921 from pipecat-ai/vp-azure-ex-cleanup
cleanup logger message
2025-10-27 12:59:08 -05:00
vipyne
8096e62b34 cleanup logger message 2025-10-27 11:27:30 -05:00
kompfner
20f4b0e8ff Merge pull request #2914 from pipecat-ai/pk/gemini-function-calling-fixes
Gemini function calling fixes
2025-10-27 09:45:29 -04:00
Paul Kompfner
6feaf91789 Fix a bug in GeminiLLMAdapter's handling of Gemini-specific context messages 2025-10-27 09:42:24 -04:00
Mark Backman
91d3ae07b3 Merge pull request #2915 from Rickaym/fix--rounding-the-edges-of-observer-function-method-deprecation
fix: use correct property names
2025-10-24 19:42:34 -04:00
Pyae Sone Myo
71841f71ef fix: use correct property names 2025-10-25 00:47:46 +06:30
Paul Kompfner
949b807023 Close genai client more gracefully to avoid printed warnings. We're now following the genai library guidance: https://github.com/googleapis/python-genai?tab=readme-ov-file#close-a-client 2025-10-24 11:36:25 -04:00
Paul Kompfner
4ad15f9a01 Update Gemini service to include function name when sending function responses in context 2025-10-24 11:04:52 -04:00
Paul Kompfner
99d94fc625 Update Gemini service to use "user" role for function responses, as shown in the Gemini docs 2025-10-24 10:05:14 -04:00
Mark Backman
a3d630c0d1 Merge pull request #2908 from pipecat-ai/mb/runner-daily-start-route
fix: add support for DAILY_SAMPLE_ROOM_URL when calling /start for Da…
2025-10-23 14:15:42 -04:00
Mark Backman
04b482c445 Merge branch 'main' into mb/runner-daily-start-route 2025-10-23 14:11:38 -04:00
Mark Backman
b2bce4916f Merge pull request #2900 from pipecat-ai/mb/quickstart-pipecat-cli
Quickstart to use Pipecat CLI
2025-10-23 10:55:42 -04:00
Mark Backman
60e9817f16 fix: add support for DAILY_SAMPLE_ROOM_URL when calling /start for DailyTransport 2025-10-22 16:48:30 -04:00
kompfner
c655d0d313 Merge pull request #2907 from pipecat-ai/mb/service-switcher-updates
ServiceSwitcher updates
2025-10-22 11:23:48 -04:00
Paul Kompfner
ea6e146f2d Update TestServiceSwitcher to exercise targeting system frames only to the active service 2025-10-22 11:14:27 -04:00
Mark Backman
ec890a834f Rename to filter_system_frames 2025-10-22 11:01:33 -04:00
Mark Backman
5b921fc054 fix: FunctionFilter adds block_system_frame arg 2025-10-22 10:53:01 -04:00
Mark Backman
f1040100f4 Update ServiceSwitcher and LLMSwitcher docstrings 2025-10-22 10:51:03 -04:00
Mark Backman
54691ee781 Merge pull request #2904 from pipecat-ai/mb/bump-aws-sdk-bedrock-runtime
Upgrade aws_sdk_bedrock_runtime to v0.1.1
2025-10-22 08:58:48 -04:00
Mark Backman
49239a23c6 Upgrade aws_sdk_bedrock_runtime to v0.1.1 2025-10-21 23:27:38 -04:00
Aleix Conchillo Flaqué
e0c43de13f Merge pull request #2903 from pipecat-ai/aleix/pipecat-0.0.91
update CHANGELOG for 0.0.91
2025-10-21 19:09:23 -07:00
Aleix Conchillo Flaqué
cc4c96d099 update CHANGELOG for 0.0.91 2025-10-21 19:00:51 -07:00
Aleix Conchillo Flaqué
788465cb04 Merge pull request #2901 from pipecat-ai/pk/llmcontext-messages
Add `messages` property to `LLMContext` for usage parity with `OpenAI…
2025-10-21 18:00:33 -07:00
Aleix Conchillo Flaqué
db934eade0 Merge pull request #2891 from pipecat-ai/aleix/daily-pipecat-runner-args
runner: allow starting a bot from Daily's /start endpoint
2025-10-21 17:59:13 -07:00
Mark Backman
0b8c966a11 Merge pull request #2892 from pipecat-ai/mb/aws-llm-claude-fix
fix: AWSBedrockLLMService compatibility for newer Claude models
2025-10-21 20:50:20 -04:00
Mark Backman
5849485bc6 fix: AWSBedrockLLMService compatibility for newer Claude models 2025-10-21 19:47:27 -04:00
Aleix Conchillo Flaqué
459af58540 runner: allow starting a bot from Daily's /start endpoint 2025-10-21 16:28:11 -07:00
Aleix Conchillo Flaqué
576bd67e85 runner: add body field to RunnerArguments 2025-10-21 16:27:48 -07:00
Aleix Conchillo Flaqué
1e8629bf96 runner: allow passing an api_key to configure 2025-10-21 16:27:48 -07:00
Paul Kompfner
776a3526f9 Add messages property to LLMContext for usage parity with OpenAILLMContext.
This wasn't really an issue before, when folks were *knowingly* migrating from `OpenAILLMContext` to `LLMContext`. But in the latest AWS Nova Sonic change, we're swapping it out from under folks, so this kind of compatibility is more important.

For context, the reason we *didn't* offer the `messages` property earlier was to aid in the development of `LLMContext`—we wanted to draw attention to all the places where messages were being read from context, so we could find the places where we might need to pass an argument to the read.
2025-10-21 17:38:39 -04:00
kompfner
2ced044418 Merge pull request #2896 from pipecat-ai/pk/add-back-types-that-were-meant-to-be-deprecated-not-removed
Add back types that were removed, when they should only have been dep…
2025-10-21 17:33:17 -04:00
Mark Backman
151f187837 Merge pull request #2895 from pipecat-ai/mb/update-env-example
Organize the env.example file
2025-10-21 17:15:22 -04:00
Mark Backman
67afa718d0 Merge pull request #2898 from pipecat-ai/mb/ellipses-changelog
Changelog entry for PR #2877
2025-10-21 17:02:08 -04:00
Mark Backman
52ab0eccc0 Quickstart to use Pipecat CLI 2025-10-21 15:57:45 -04:00
Vanessa Pyne
d1f1b68b71 Merge pull request #2863 from pipecat-ai/vp-custom-frame-processor-ex
add 08-custom-frame-processor.py to foundational examples
2025-10-21 14:15:38 -05:00
Mark Backman
a479c32665 Merge pull request #2894 from pipecat-ai/mb/cli-readme
Add Pipecat CLI to README's ecosystem section
2025-10-21 13:20:12 -04:00
Mark Backman
9f66b0ba41 Add Pipecat CLI to README's ecosystem section 2025-10-21 13:17:37 -04:00
vipyne
23385ca3d2 replace foundational example 08-bots-arguing.py with 08-custom-frame-processor.py 2025-10-21 11:56:35 -05:00
vipyne
8b24bae9c5 pr notes 2025-10-21 11:42:06 -05:00
Mark Backman
0502ec6c44 Changelog entry for PR #2877 2025-10-21 11:58:27 -04:00
Mark Backman
81645910e0 Merge pull request #2877 from nimobeeren/patch-1
Add ellipsis character to sentence ending punctuation list
2025-10-21 11:53:17 -04:00
Filipi da Silva Fuchter
d6ab4c41b0 Merge pull request #2897 from pipecat-ai/filipi/fix_proxy_route
Fixed an issue in the runner's proxy_request
2025-10-21 12:28:04 -03:00
Filipi Fuchter
2f92cb8781 Fixed an issue in the runner's proxy_request where a session that exists but has empty data was being treated as invalid. 2025-10-21 11:41:52 -03:00
Paul Kompfner
fbf274374c Add back types that were removed, when they should only have been deprecated 2025-10-21 09:56:31 -04:00
Mark Backman
427efecf5b Organize the env.example file 2025-10-21 09:43:46 -04:00
Filipi da Silva Fuchter
b3e54546ac Merge pull request #2888 from pipecat-ai/filipi/rtvi_duplicated_frames
Fixed an issue where the RTVIProcessor was sending duplicate UserStartedSpeakingFrame and UserStoppedSpeakingFrame messages.
2025-10-21 08:57:32 -03:00
Filipi Fuchter
de46631bac Fixed an issue where the RTVIProcessor was sending duplicate UserStartedSpeakingFrame and UserStoppedSpeakingFrame messages. 2025-10-20 18:39:00 -03:00
vipyne
abf0150261 add 47-custom-frame-processor.py to foundational examples 2025-10-20 12:11:40 -05:00
Aleix Conchillo Flaqué
a0c93ab6de update CHANGELOG cosmetics 2025-10-20 09:07:50 -07:00
Aleix Conchillo Flaqué
4bec566bbf Merge pull request #2885 from pipecat-ai/aleix/daily-python-0.20.0
pyproject: update daily-python to 0.20.0
2025-10-20 08:04:52 -07:00
Aleix Conchillo Flaqué
ec3cd24182 pyproject: update daily-python to 0.20.0 2025-10-20 08:04:34 -07:00
kompfner
e36e64c2e8 Merge pull request #2750 from pipecat-ai/pk/aws-nova-sonic-universal-llmcontext-1
Support new `LLMContext` pattern with `AWSNovaSonicLLMService`
2025-10-20 10:12:53 -04:00
Paul Kompfner
02a88022dd Add a bit more detail to CHANGELOG related to AWSNovaSonicLLMService's support for LLMContext 2025-10-20 10:06:09 -04:00
Paul Kompfner
6cae61f2cc Add a bit more detail to CHANGELOG entry about AWSNovaSonicLLMService's support for LLMContext 2025-10-20 09:50:23 -04:00
Paul Kompfner
3b40079120 Add a detailed warning when trying to import things from pipecat.services.aws_nova_sonic.context or pipecat.services.aws.nova_sonic.context 2025-10-20 09:49:05 -04:00
Paul Kompfner
ff0b38859b Remove AWS Nova Sonic's context.py, which was always concerned with types for internal use only. Now those types are either gone or moved elsewhere. 2025-10-20 09:49:05 -04:00
Paul Kompfner
4d499324d1 Re-apply a change to AWSNovaSonicLLMService that was lost in a rebase 2025-10-20 09:49:05 -04:00
Paul Kompfner
f13e006db2 Bump version in deprecation message in docstring 2025-10-20 09:49:05 -04:00
Paul Kompfner
87d9e8c9cd Re-apply a couple of recent changes to AWSNovaSonicLLMService that were lost in a rebase 2025-10-20 09:49:05 -04:00
Paul Kompfner
4820f1c059 Address some AWSNovaSonicLLMService context-recording edge cases 2025-10-20 09:49:05 -04:00
Paul Kompfner
860c39d1b1 Get rid of LLMContext.get_messages_for_persistent_storage().
The reason for its `system_instruction` argument was to support usage with LLMs where you might pass the system instruction as a parameter to the `LLMService` rather than specifying it in the context.

But as I thought about it more I became unconvinced that the `system_instruction` argument was really beneficial:

- If you specified your system instruction in your context in the first place, it'll still be there when you read messages for persistent storage
- If you didn't specify your system instruction in the context and instead passed it in as an `LLMService` parameter, you most likely *don't* want it to be in the context when you read messages for persistent storage
- ...and if you really really do need to inject it at the start of the context, it's quite easy to do anyway

And if we remove the `system_instruction` argument from `get_messages_for_persistent_storage()`, then it's essentially just `get_messages()`.
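
For the last bullet's case, injecting a system instruction at the start of the messages might look like the sketch below. It assumes messages are plain dicts with `role` and `content` keys; `with_system_instruction` is a hypothetical helper, not part of `LLMContext`.

```python
def with_system_instruction(messages: list[dict], system_instruction: str) -> list[dict]:
    """Return a copy of `messages` with a system message prepended."""
    return [{"role": "system", "content": system_instruction}, *messages]


# Hypothetical usage: `messages` stands in for whatever get_messages() returned.
messages = [{"role": "user", "content": "What's the weather?"}]
stored = with_system_instruction(messages, "You are a helpful assistant.")
```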
2025-10-20 09:49:05 -04:00
Paul Kompfner
ae5c5ed7f6 Update AWSNovaSonicLLMService to work with LLMContext and LLMContextAggregatorPair 2025-10-20 09:49:00 -04:00
shreyas-sarvam
5052da8ce6 Merge branch 'main' into sarvam/stt 2025-10-20 13:45:24 +05:30
Aleix Conchillo Flaqué
7aa01c1ca8 Merge pull request #2882 from pipecat-ai/aleix/base-transport-output-cleanup
base output transport cleanup
2025-10-18 07:38:13 -07:00
Mark Backman
4d6356748f Merge pull request #2819 from shreyas-sarvam/sarvam/tts-v3
feat: Add support for bulbul:v3
2025-10-18 09:36:57 -04:00
Mark Backman
5b1a182421 Merge branch 'main' into sarvam/tts-v3 2025-10-18 09:34:10 -04:00
Mark Backman
6ac0c34413 Merge pull request #2879 from sam-s10s/fix/smx-vocab
Fix for SpeechmaticsSTTService AdditionVocabEntry entries
2025-10-18 09:27:23 -04:00
Mark Backman
c115422dbf Merge pull request #2857 from dan-ince-aai/main
feat: add keyterms_prompt to AssemblyAI service
2025-10-18 09:20:43 -04:00
Mark Backman
a2a973be27 Merge pull request #2842 from nbyers-altira/fix-riva-segmented
Fix NVIDIA Riva Segmented STT by adding missing is_final parameter to _handle_transcription
2025-10-18 09:11:11 -04:00
Aleix Conchillo Flaqué
0407744950 BaseOutputTransport: simplify process_frame 2025-10-17 21:55:20 -07:00
Aleix Conchillo Flaqué
7ce370ccc6 BaseOutputTransport: simplify bot speaking logic 2025-10-17 15:13:20 -07:00
nbyers-altira
a4867f61aa be a tad more precise in changelog 2025-10-17 13:51:49 -04:00
nbyers-altira
a67a765783 add changelog, run linter 2025-10-17 13:49:52 -04:00
nbyers-altira
81221668b1 Merge remote-tracking branch 'upstream/main' into fix-riva-segmented 2025-10-17 13:45:59 -04:00
Daniel Ince
cc9c264940 Merge branch 'main' into main 2025-10-17 15:15:36 +01:00
Sam Sykes
f2c61ac9fd Fix for AdditionVocabEntry without sounds_like items. 2025-10-17 14:34:37 +01:00
Filipi da Silva Fuchter
88f8c10f63 Merge pull request #2875 from pipecat-ai/filipi/rtvi_routes
Creating the WebRTC routes that mimic the ones provided by Pipecat Cloud.
2025-10-17 10:13:45 -03:00
Filipi Fuchter
855f4842dd Creating the WebRTC routes that mimic the ones provided by Pipecat Cloud. 2025-10-17 10:10:19 -03:00
Filipi da Silva Fuchter
2bf44fe2af Merge pull request #2853 from pipecat-ai/filipi/trickle_ice
Adding support for trickle ICE.
2025-10-17 09:00:32 -03:00
Filipi Fuchter
3e8a7cc254 Adding support for trickle ICE to the SmallWebRTCTransport. 2025-10-17 08:57:45 -03:00
Daniel Ince
a600c05570 Merge branch 'main' into main 2025-10-17 11:43:38 +01:00
dan-ince-aai
3ba6b55659 feat: multilingual + changelog updates 2025-10-17 11:38:03 +01:00
dan-ince-aai
d5f2dcfac0 lint 2025-10-17 11:32:06 +01:00
Nimo Beeren
d1d74c571c add ellipsis character to sentence ending punctuation list 2025-10-17 10:38:06 +02:00
shreyas-sarvam
d12134038b chore: Update CHANGELOG 2025-10-17 10:07:58 +05:30
shreyas-sarvam
a22af3a7e0 Merge branch 'main' into sarvam/stt 2025-10-17 10:00:49 +05:30
Aleix Conchillo Flaqué
76e07c6c48 Merge pull request #2870 from pipecat-ai/aleix/openaitts-update-settings
OpenAITTSService: allow updating instructions and speed
2025-10-16 13:21:12 -07:00
Aleix Conchillo Flaqué
8d8503bca7 OpenAITTSService: allow updating instructions and speed 2025-10-16 13:20:49 -07:00
Aleix Conchillo Flaqué
a444097060 Merge pull request #2872 from pipecat-ai/aleix/pipeline-task-cancellation-fixes
PipelineTask: fix task cancellation issues
2025-10-16 13:18:13 -07:00
Aleix Conchillo Flaqué
1b9e96c016 PipelineTask: fix task cancellation issues 2025-10-16 13:16:19 -07:00
Vanessa Pyne
7967bc53c3 Merge pull request #2868 from pipecat-ai/vp-whatsapp-dep-mv
only import whatsapp deps if using whatsapp runner
2025-10-16 14:16:28 -05:00
vipyne
6381335346 Add --whatsapp flag to runner 2025-10-16 14:15:26 -05:00
vipyne
0fd5d26104 add WHATSAPP_APP_SECRET to required whatsapp env vars 2025-10-16 10:37:56 -05:00
vipyne
41f817bf04 only import whatsapp deps if using whatsapp runner 2025-10-16 10:37:56 -05:00
Matej Marinko
9acc36c58e Update model params for Soniox STT
- remove deprecated parameters and add new ones
- add support for v3 context
2025-10-16 08:51:40 +02:00
shreyas-sarvam
27115e6565 Merge branch 'main' into sarvam/tts-v3 2025-10-16 12:09:50 +05:30
shreyas-sarvam
1ecf6e05fe Merge branch 'main' into sarvam/stt 2025-10-16 12:08:32 +05:30
Mark Backman
3c4807d7d4 Merge pull request #2859 from pipecat-ai/mb/openai-package-upgrade
Bump openai, openpipe versions, add 14x foundational example
2025-10-15 15:41:32 -04:00
Mark Backman
8902f1dc94 Bump openai, openpipe versions, add 14x foundational example 2025-10-15 15:17:22 -04:00
Mark Backman
a25333ee51 Merge pull request #2856 from pipecat-ai/mb/pr-2840-cleanup
Fix an issue in ElevenLabsHttpTTSService where the last word is not e…
2025-10-15 15:16:43 -04:00
Mark Backman
82c7d7ad83 Merge pull request #2867 from pipecat-ai/mb/update-moondream-readme
Update moondream chatbot README link
2025-10-15 15:16:19 -04:00
Mark Backman
ba2ab51ef7 Merge pull request #2866 from pipecat-ai/mb/add-sentry-foundational
Add foundational 47-sentry-metrics.py
2025-10-15 15:15:52 -04:00
Mark Backman
22557fa668 Fix an issue in ElevenLabsHttpTTSService where the last word is not emitted 2025-10-15 15:13:54 -04:00
Vanessa Pyne
3fbf59e7c6 Merge pull request #2864 from pipecat-ai/vp-trace-log
WhatsApp transport debug log -> trace log
2025-10-15 13:03:58 -05:00
vipyne
129ab5ea0e WhatsApp transport debug log -> trace log 2025-10-15 13:02:57 -05:00
Aleix Conchillo Flaqué
dc917523d0 Merge pull request #2855 from pipecat-ai/aleix/stt-tts-connected-disconnected-events
services: added on_connected/on_disconnected events
2025-10-15 10:41:38 -07:00
Aleix Conchillo Flaqué
5ea7cc9d32 services: added on_connected/on_disconnected events 2025-10-15 10:39:31 -07:00
Mark Backman
e11ede475b Update moondream chatbot README link 2025-10-15 13:22:56 -04:00
Mark Backman
90d29e04af Merge pull request #2861 from pipecat-ai/mb/11labs-http-apply-text-normalization-fix
fix: set apply_text_normalization as request parameter in ElevenLabsH…
2025-10-15 12:59:36 -04:00
Mark Backman
4c67136a8d Merge pull request #2858 from pipecat-ai/mb/daily-runner-room-properties
Add room_properties to the Daily runner configure() method
2025-10-15 12:58:18 -04:00
Mark Backman
9d78402a33 fix: set apply_text_normalization as request parameter in ElevenLabsHttpTTSService 2025-10-15 12:56:42 -04:00
Mark Backman
73877218e9 Add room_properties to the Daily runner configure() method 2025-10-15 12:55:19 -04:00
Mark Backman
6a1be90cbb Merge pull request #2862 from pipecat-ai/mb/11labs-http-aggregate-sentences
Add aggregate_sentences arg to ElevenLabsHttpTTSService
2025-10-15 12:54:23 -04:00
Aleix Conchillo Flaqué
fbac959ecb Merge pull request #2865 from pipecat-ai/aleix/stop-audio-filter-also-on-cancel
BaseInputTransport: stop audio filter on cancel
2025-10-15 09:53:24 -07:00
Aleix Conchillo Flaqué
18dd85431c Merge pull request #2854 from pipecat-ai/aleix/cartesia-stt-service-websocket
CartesiaSTTService to inherit from WebsocketSTTService
2025-10-15 09:51:42 -07:00
Aleix Conchillo Flaqué
abc569b3d2 examples(foundational/07): use CartesiaSTTService 2025-10-15 09:46:57 -07:00
Mark Backman
fa5d4ecf86 Add foundational 47-sentry-metrics.py 2025-10-15 12:45:07 -04:00
Aleix Conchillo Flaqué
83b0dc39f7 BaseInputTransport: stop audio filter on cancel 2025-10-15 09:22:48 -07:00
Mark Backman
0c31b5ef19 Add aggregate_sentences arg to ElevenLabsHttpTTSService 2025-10-15 11:49:26 -04:00
dan-ince-aai
d16c36c56d feat: add keyterms_prompt to AssemblyAI service 2025-10-15 14:27:52 +01:00
Mark Backman
8fe3bcd484 Merge pull request #2840 from Rickaym/fix--excess-space-in-elevelabs-word-timestamp-joins
fix: handle ElevenLabs partial word concatenation across alignment chunks gracefully
2025-10-15 09:01:05 -04:00
Aleix Conchillo Flaqué
be2858bfbb CartesiaSTTService: inherit from WebsocketSTTService 2025-10-14 14:14:57 -07:00
Pyae Sone Myo
b6b0997553 fix: add support for partial words 2025-10-14 23:06:13 +06:30
Pyae Sone Myo
3b751322d3 fix: add interruption reset for partial word states 2025-10-14 23:04:09 +06:30
Aleix Conchillo Flaqué
fce6f55ddb Merge pull request #2844 from pipecat-ai/aleix/runner-files-path
runner: allow subdirectories in --folder
2025-10-14 08:38:38 -07:00
Aleix Conchillo Flaqué
d9580f72a9 runner: allow subdirectories in --folder 2025-10-13 18:29:19 -07:00
nbyers-altira
cc66ac14f1 add is_final to segmented func. sig. instead so tracing is consistent 2025-10-13 10:48:41 -04:00
nbyers-altira
9ddec0f8b4 is_final is not part of the segmented _handle_transcription function signature 2025-10-13 10:44:25 -04:00
shreyas-sarvam
5cc1d8a024 refactor: Update dependencies and improve logging 2025-10-13 10:18:15 +05:30
shreyas-sarvam
9babfe9fd9 refactor: Improve code readability and replace deprecated interruption frames 2025-10-13 08:54:29 +05:30
Pyae Sone Myo
21d8d148b8 fix: handle partial words across alignment chunks gracefully 2025-10-12 22:10:11 +06:30
Aleix Conchillo Flaqué
0588c82bbf Merge pull request #2838 from makosst/manta_graph_readme
Added Manta Graph to README
2025-10-11 09:31:21 -07:00
makosst
16e9093d5a Added Manta Graph to README 2025-10-11 09:20:17 -07:00
Aleix Conchillo Flaqué
91a5d580fd Merge pull request #2835 from pipecat-ai/aleix/tts-http-aligned-audio-frames
tts: fix RimeHttpTTSService/PiperTTSService 16-bit audio frames alignment
2025-10-10 14:20:44 -07:00
Aleix Conchillo Flaqué
0473556992 tts: fix RimeHttpTTSService/PiperTTSService 16-bit audio frames alignment 2025-10-10 14:19:22 -07:00
Aleix Conchillo Flaqué
fdaa4e476e Merge pull request #2830 from pipecat-ai/aleix/pipecat-0.0.90
update CHANGELOG for 0.0.90
2025-10-10 10:22:26 -07:00
Aleix Conchillo Flaqué
502e7e42a7 update CHANGELOG for 0.0.90 2025-10-10 10:20:19 -07:00
kompfner
2ab3d4fb42 Merge pull request #2834 from pipecat-ai/pk/update-vertex-disclaimer
Update a Google Vertex disclaimer for accuracy
2025-10-10 13:19:51 -04:00
Paul Kompfner
55014bdd77 Update a Google Vertex disclaimer for accuracy 2025-10-10 13:18:03 -04:00
kompfner
334796bd65 Merge pull request #2833 from pipecat-ai/pk/vertex-non-optional-location
`location` should not be optional when using Google Vertex.
2025-10-10 13:02:40 -04:00
Paul Kompfner
1c25b6fb72 location should not be optional when using Google Vertex.
Also, update `GoogleVertexLLMService` initialization pattern in the example file.
2025-10-10 12:58:45 -04:00
Mark Backman
91b29de7ca Merge pull request #2832 from pipecat-ai/mb/docs-fixes-0.0.90
Docs fixes for 0.0.90 release
2025-10-10 12:46:40 -04:00
Mark Backman
21d610cd30 Docs fixes for 0.0.90 release 2025-10-10 12:43:31 -04:00
Mark Backman
f7fe673ad1 Merge pull request #2831 from pipecat-ai/mb/update-evals
Update release evals for OpenAI Realtime, Gemini Live Vertex; shorten…
2025-10-10 12:34:27 -04:00
Mark Backman
4b415721e2 Update release evals for OpenAI Realtime, Gemini Live Vertex; shorten 26 foundational names 2025-10-10 12:26:23 -04:00
kompfner
8d2a98e0e7 Merge pull request #2829 from pipecat-ai/pk/fix-gemini-live-deprecation-messages
Fix deprecation messages pointing users to the new import paths for G…
2025-10-10 10:42:15 -04:00
Paul Kompfner
523e890c8c Fix deprecation messages pointing users to the new import paths for Gemini Live 2025-10-10 10:30:38 -04:00
kompfner
3c748fe772 Merge pull request #2823 from pipecat-ai/pk/vertex-init-args-fixup
Move `location` and `project_id` out of `InputParams` in `GoogleVerte…
2025-10-10 10:18:51 -04:00
kompfner
d293cee372 Merge pull request #2822 from pipecat-ai/pk/make-pause-processing-frames-more-robust
Make `pause_processing_frames()` and `pause_processing_system_frames(…
2025-10-10 10:16:27 -04:00
Paul Kompfner
8b62a96878 Improve how we're deprecating location and project_id in GoogleVertexLLMService, allowing user code to (correctly) continue referring to GoogleVertexLLMService.InputParams.
Also fix the slightly wrong (but so far harmless) pattern of initializing `OpenAILLMService.InputParams()` in the `GoogleVertexLLMService` if `params` wasn't provided—we should be letting the superclass decide what to do if the argument isn't specified.
2025-10-10 10:12:00 -04:00
Mark Backman
0c102ce70b Merge pull request #2826 from pipecat-ai/mb/deprecate-livekit-frame-serializer
Deprecate LivekitFrameSerializer
2025-10-10 10:01:45 -04:00
Mark Backman
3894d2a4b9 Deprecate LivekitFrameSerializer 2025-10-10 09:51:57 -04:00
Aleix Conchillo Flaqué
1f6b61c0db Merge pull request #2828 from pipecat-ai/aleix/gemini-live-gemini-to-llm
google: rename google.gemini_live.gemini to google.gemini_live.llm
2025-10-10 06:42:51 -07:00
Aleix Conchillo Flaqué
8ee28b37cd google: rename google.gemini_live.vertext to google.gemini_live.llm_vertex 2025-10-10 06:41:19 -07:00
Filipi da Silva Fuchter
e85e7e4d84 Merge pull request #2773 from pipecat-ai/filipi/krisp_viva
Added audio filter `KrispVivaFilter` using the Krisp VIVA SDK.
2025-10-10 09:51:15 -03:00
Filipi Fuchter
1b3afb5511 Added audio filter KrispVivaFilter using the Krisp VIVA SDK 2025-10-10 09:44:47 -03:00
Aleix Conchillo Flaqué
7cec013666 google: rename google.gemini_live.gemini to google.gemini_live.llm 2025-10-09 22:20:09 -07:00
Aleix Conchillo Flaqué
86127167fb Merge pull request #2827 from pipecat-ai/aleix/openai-realtime-move
move openai_realtime to openai.realtime
2025-10-09 22:18:04 -07:00
Aleix Conchillo Flaqué
9935a68018 examples(19b): fix deprecations 2025-10-09 22:14:52 -07:00
Aleix Conchillo Flaqué
5679dde70f ai_service: use openai.realtime.events instead of openai_realtime_beta.events 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
d81b0f6368 update CHANGELOG with openai_realtime deprecation 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
9698b008da deprecate openai_realtime 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
7b05c9283b move openai.realtime.azure to azure.realtime.llm 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
303dd2ec35 move openai.realtime.openai to openai.realtime.llm 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
aa6e81648a move openai_realtime to openai.realtime 2025-10-09 22:14:46 -07:00
Aleix Conchillo Flaqué
1a87870ef3 Merge pull request #2825 from pipecat-ai/aleix/aws-nova-sonic-move
move aws_nova_sonic to aws.nova_sonic
2025-10-09 18:37:46 -07:00
Aleix Conchillo Flaqué
aac4ce2d12 update CHANGELOG with aws_nova_sonic deprecation 2025-10-09 18:32:26 -07:00
Aleix Conchillo Flaqué
2a79b2c853 aws: deprecate aws_nova_sonic 2025-10-09 17:44:29 -07:00
Aleix Conchillo Flaqué
15bf5b1533 aws: move aws_nova_sonic to aws.nova_sonic 2025-10-09 17:35:47 -07:00
Aleix Conchillo Flaqué
cdc86db8ce update CHANGELOG with GoogleVertexLLMService token fix 2025-10-09 16:58:22 -07:00
Aleix Conchillo Flaqué
9d2ad750b5 Merge pull request #2779 from LucasStringPay/patch-1
Ignore None value for 'completion_tokens' or similar for Gemini
2025-10-09 16:55:33 -07:00
Aleix Conchillo Flaqué
19ceb1a48f Merge pull request #2817 from pipecat-ai/aleix/runner-download-folder
runner: add --folder argument to allow file downloads
2025-10-09 16:55:17 -07:00
Aleix Conchillo Flaqué
59217eae38 runner: add --folder argument to allow file downloads 2025-10-09 16:49:51 -07:00
Aleix Conchillo Flaqué
bea0aee835 Merge pull request #2824 from pipecat-ai/aleix/gemini-under-google
google: move gemini_live inside google service
2025-10-09 16:40:15 -07:00
Aleix Conchillo Flaqué
aeace9b9be google: move gemini_live inside google service 2025-10-09 16:06:42 -07:00
Paul Kompfner
2994640f47 Move location and project_id out of InputParams in GoogleVertexLLMService, making them top-level init args instead. We do this for two reasons:
- Conceptually, these args comprise project-level setup, akin to credentials, whereas everything in `InputParams` is concerned with model configuration
- Providing a `project_id` when initializing `GoogleVertexLLMService` should not be optional, but prior to the change in this commit it was (erroneously) treated as optional by dint of `InputParams` being optional

This improvement was discussed [in this comment](https://github.com/pipecat-ai/pipecat/pull/2795#discussion_r2408279142).
2025-10-09 16:53:21 -04:00
Paul Kompfner
10069719e4 Make pause_processing_frames() and pause_processing_system_frames() more robust in FrameProcessor.
To understand this fix, let's look exclusively at `pause_processing_frames()` (`pause_processing_system_frames()` works the same way).

`pause_processing_frames()` works by setting a `__should_block_frames` flag, which is then read each time through the loop in the long-running `__process_frame_task_handler`. If `__should_block_frames` is `True`, it pauses processing frames until it's resumed.

Prior to this fix, the check for `__should_block_frames` came before `await self.__process_queue.get()`. The problem is that much of the time spent in the loop is spent waiting for a frame from the process queue. So if `pause_processing_frames()` is called at any time other than within `process_frame()` itself, it doesn't take effect by the next frame, only on the frame *after* the next, which is later than intended.

Because thus far in the Pipecat codebase we've only ever called `pause_processing_frames()` and `pause_processing_system_frames()` from within `process_frame()`, this change should have no behavioral effect. But it will be helpful if we ever need to call it from anywhere else. I noticed this issue while developing a feature that did exactly that (though I later abandoned that code).
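
A toy sketch of the ordering issue described above, assuming the fix amounts to checking the flag after the queue read rather than before it. This is not the actual `FrameProcessor` code: `MiniProcessor`, `pause()`, `check_after_get` and `demo()` are hypothetical names used only for illustration.

```python
import asyncio


class MiniProcessor:
    """Toy stand-in for a frame processor (hypothetical, not Pipecat's class)."""

    def __init__(self, check_after_get):
        self.check_after_get = check_after_get
        self.queue = asyncio.Queue()
        self.should_block = False
        self.resume_event = asyncio.Event()
        self.processed = []

    def pause(self):
        # Analogous to setting the "should block frames" flag.
        self.should_block = True

    async def _wait_if_paused(self):
        if self.should_block:
            await self.resume_event.wait()
            self.resume_event.clear()
            self.should_block = False

    async def run(self):
        while True:
            if not self.check_after_get:
                # Old ordering: the flag is read before waiting on the queue.
                await self._wait_if_paused()
            frame = await self.queue.get()
            if self.check_after_get:
                # Assumed new ordering: the flag is read once a frame arrives.
                await self._wait_if_paused()
            self.processed.append(frame)


async def demo(check_after_get):
    proc = MiniProcessor(check_after_get)
    task = asyncio.create_task(proc.run())
    await asyncio.sleep(0.01)  # let the loop start and park in queue.get()
    proc.pause()               # pause from outside the processing path
    await proc.queue.put("A")
    await proc.queue.put("B")
    await asyncio.sleep(0.05)  # give the loop time to handle what it can
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return proc.processed


if __name__ == "__main__":
    print(asyncio.run(demo(check_after_get=False)))
    print(asyncio.run(demo(check_after_get=True)))
```

With the check before the queue read, frame "A" is still processed after `pause()` because the loop was already parked in `queue.get()`. With the check after the read, nothing is processed until the processor is resumed.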
2025-10-09 15:57:31 -04:00
shreyas-sarvam
1e31fc7f9b fix: Format errors 2025-10-09 22:09:25 +05:30
kompfner
046b76df60 Merge pull request #2820 from pipecat-ai/pk/gemini-live-vertex-support
Support Gemini Live + Vertex AI
2025-10-09 11:53:41 -04:00
Paul Kompfner
f2d9063984 Renames: remove "multimodal" from Gemini Live types 2025-10-09 10:58:36 -04:00
shreyas-sarvam
7c1e2793c5 feat: Add support for bulbul:v3 and bulbul:v3-beta 2025-10-09 18:26:22 +05:30
Paul Kompfner
99f008e927 Make a note in our examples that there's an issue with Gemini Live + Vertex around specifying a modality other than AUDIO 2025-10-08 21:03:07 -04:00
Paul Kompfner
2699f0c2a6 Fix tool calls when using Gemini Live + Vertex AI 2025-10-08 21:03:07 -04:00
Paul Kompfner
0b6dd98000 Make a note in our examples that there's an issue with Gemini Live + Vertex around using "google_search" alongside other tools 2025-10-08 21:03:07 -04:00
Paul Kompfner
a14fb20d15 Fix Gemini Live w/Vertex AI not being able to handle an empty list provided for "function_declarations" 2025-10-08 21:03:07 -04:00
Paul Kompfner
728361a6a7 Add GeminiVertexMultimodalLiveLLMService 2025-10-08 21:03:01 -04:00
kompfner
106db69e8e Merge pull request #2816 from pipecat-ai/pk/gemini-live-await-ongoing-response-after-endframe
Implement ending `GeminiMultimodalLiveLLMService` gracefully (i.e. af…
2025-10-08 17:20:14 -04:00
Paul Kompfner
cf90071926 Format fix 2025-10-08 17:19:46 -04:00
Paul Kompfner
deaeb75a1f Fix changelog after rebase (and add a missing item) 2025-10-08 17:16:31 -04:00
Paul Kompfner
a666327d70 Implement ending GeminiMultimodalLiveLLMService gracefully (i.e. after the bot is finished) 2025-10-08 17:13:04 -04:00
kompfner
13a0522546 Merge pull request #2804 from pipecat-ai/pk/gemini-live-session-resumption
Add (relatively spartan) reconnection logic to `GeminiMultimodalLiveLLMService`
2025-10-08 17:10:45 -04:00
Paul Kompfner
7da37a0d1f Pull _connection_established_threshold and _max_consecutive_failures into file-level constants 2025-10-08 17:04:05 -04:00
Paul Kompfner
7efb22a323 Add (relatively spartan) reconnection logic to GeminiMultimodalLiveLLMService, leveraging the Gemini Live session resumption mechanism 2025-10-08 16:53:21 -04:00
kompfner
8084e2f909 Merge pull request #2776 from pipecat-ai/pk/gemini-live-gen-ai-library
Gemini Live service uses the `genai` library rather than WebSockets directly
2025-10-08 16:50:16 -04:00
Paul Kompfner
86127c6a6e Add to the changelog the GeminiMultimodalLiveLLMService update to use google-genai 2025-10-08 16:46:41 -04:00
Paul Kompfner
402e019ae2 Make a bit of code clearer 2025-10-08 16:45:55 -04:00
Paul Kompfner
f09e4e238b Fix some mishandling of enum values 2025-10-08 16:45:55 -04:00
Paul Kompfner
2921162b3b Add deprecation warning around importing StartSensitivity and EndSensitivity from pipecat.services.gemini_multimodal_live.events 2025-10-08 16:45:55 -04:00
Paul Kompfner
ac1582c906 Let users directly use google-genai types rather than aliased re-exported types 2025-10-08 16:45:55 -04:00
Paul Kompfner
e4b01a5844 Bumping deprecation version of GeminiMultimodalLiveLLMService's base_url arg 2025-10-08 16:45:55 -04:00
Paul Kompfner
fa663abbbc Add CHANGELOG entry for new GeminiMultimodalLiveLLMService configuration options 2025-10-08 16:45:55 -04:00
Paul Kompfner
d19e6111c3 Bumping deprecation version of GeminiMultimodalLiveLLMService's base_url arg 2025-10-08 16:45:55 -04:00
Paul Kompfner
8a6d504a7e Add enable_affective_dialog and proactivity settings to GeminiMultimodalLiveLLMService 2025-10-08 16:45:55 -04:00
Paul Kompfner
43915937f2 Update how we import and re-export some types in GeminiMultimodalLiveLLMService 2025-10-08 16:45:55 -04:00
Paul Kompfner
48e92a22fe Add thinking settings to GeminiMultimodalLiveLLMService 2025-10-08 16:45:55 -04:00
Paul Kompfner
566af6b0b8 Minor comment improvement 2025-10-08 16:45:55 -04:00
Paul Kompfner
12e7613d5f Deprecate the base_url argument to GeminiMultimodalLiveLLMService.
It expected a WebSocket URL, but we're no longer (directly) using WebSockets to talk to Gemini. Instead of trying to (potentially erroneously) map a given custom WebSocket URL to an `HttpOptions` object (the new preferred way of customizing requests made by the Gemini API client), we're simply deprecating `base_url` and pointing users to the `http_options` argument instead.
2025-10-08 16:45:55 -04:00
Paul Kompfner
04a68f2c57 Fix tracing in GeminiMultimodalLiveLLMService 2025-10-08 16:45:55 -04:00
Paul Kompfner
9b4ca12f49 Revert to only supporting providing a single modality - looks like specifying a list of modalities results in an API error.
Also, fix some missing `await`s in error handling.
2025-10-08 16:45:55 -04:00
Paul Kompfner
453ce715a6 Add some error handling to GeminiMultimodalLiveLLMService 2025-10-08 16:45:55 -04:00
Paul Kompfner
d87b6189ba Update GeminiMultimodalLiveLLMService to use the google-genai library, which is the new recommended approach for interfacing with Gemini Live. 2025-10-08 16:45:55 -04:00
Mark Backman
8293347b77 Merge pull request #2814 from pipecat-ai/mb/openai-service-tier
Add service_tier to BaseOpenAILLMService
2025-10-08 16:44:27 -04:00
Mark Backman
c85a3f0b94 Add service_tier to BaseOpenAILLMService 2025-10-08 16:33:36 -04:00
Aleix Conchillo Flaqué
233fb25e6c Merge pull request #2810 from pipecat-ai/aleix/on-pipeline-error
PipelineTask: add on_pipeline_error event
2025-10-08 11:26:46 -07:00
Aleix Conchillo Flaqué
080978daa6 Merge pull request #2808 from pipecat-ai/aleix/readme-pipecat-tv
README: add Pipecat TV reference
2025-10-08 11:26:17 -07:00
Aleix Conchillo Flaqué
62b7c3d3b2 PipelineTask: add on_pipeline_error event 2025-10-07 18:36:38 -07:00
Mark Backman
4b2379cba8 Merge pull request #2798 from ivaaan/hume-rtvi
Hume add RTVI
2025-10-07 21:20:50 -04:00
Aleix Conchillo Flaqué
92087bdfa8 update CHANGELOG 2025-10-07 18:08:18 -07:00
Aleix Conchillo Flaqué
617919ac09 Merge pull request #2809 from pipecat-ai/aleix/revert-interruption-strategies-ordering
revert interruption strategies ordering
2025-10-07 18:07:07 -07:00
Aleix Conchillo Flaqué
0669daec3d update CHANGELOG for 0.0.89 2025-10-07 17:44:10 -07:00
Aleix Conchillo Flaqué
7c15a8c800 Revert "fix context order when using interruption strategies"
This reverts commit de8ee96927.
2025-10-07 17:42:35 -07:00
Aleix Conchillo Flaqué
066b77fba0 README: add Pipecat TV reference 2025-10-07 15:01:28 -07:00
Aleix Conchillo Flaqué
d9aef5f916 some last release CHANGELOG updates 2025-10-07 14:29:27 -07:00
Aleix Conchillo Flaqué
91ae3f8a9b Merge pull request #2807 from pipecat-ai/aleix/pipecat-0.0.88
update CHANGELOG for 0.0.88
2025-10-07 14:16:05 -07:00
Aleix Conchillo Flaqué
36da623352 update CHANGELOG for 0.0.88 2025-10-07 14:12:12 -07:00
Filipi da Silva Fuchter
31b9087ea6 Merge pull request #2805 from pipecat-ai/filipi/allowing_update_smallwebrtc_properties
Allowing to update smallwebrtc and whatsapp properties.
2025-10-07 17:57:26 -03:00
Mark Backman
1851fed22e Merge pull request #2806 from pipecat-ai/mb/deprecate-play-ht
Deprecate PlayHT TTS services
2025-10-07 16:44:53 -04:00
Mark Backman
eddce460da Deprecate PlayHT TTS services 2025-10-07 16:40:01 -04:00
Filipi Fuchter
da4f30cb6d Allowing to update smallwebrtc and whatsapp properties. 2025-10-07 17:28:14 -03:00
Mark Backman
250cf2d8f1 Merge pull request #2803 from pipecat-ai/mb/fix-11labs-stt-deprecation
Remove deprecation warning for ElevenLabsSTTService
2025-10-07 13:04:12 -04:00
Mark Backman
7bbdb4f991 Remove deprecation warning for ElevenLabsSTTService 2025-10-07 12:32:32 -04:00
Mark Backman
051c4782fb Merge pull request #2802 from pipecat-ai/mb/fix-aws-nova-sonic
Fix AWS Nova Sonic authentication
2025-10-07 10:46:03 -04:00
Mark Backman
b1ccec74b2 Fix AWS Nova Sonic authentication 2025-10-07 09:48:18 -04:00
Filipi da Silva Fuchter
92bf0d9eda Merge pull request #2794 from pipecat-ai/filipi/verifying_whatsapp_signature
Verifying WhatsApp signature to ensure the webhook request is from WhatsApp.
2025-10-07 08:57:47 -03:00
Aleix Conchillo Flaqué
f985550441 Merge pull request #2796 from pipecat-ai/aleix/fix-interruption-strategies-context-order
fix context order when using interruption strategies
2025-10-06 22:46:31 -07:00
Aleix Conchillo Flaqué
de8ee96927 fix context order when using interruption strategies 2025-10-06 22:43:01 -07:00
Aleix Conchillo Flaqué
2576d0f340 Merge pull request #2792 from pipecat-ai/aleix/google-nano-banana
GoogleLLMService: added support for image generation
2025-10-06 22:42:14 -07:00
ivaaan
f38f4711ac wip 2025-10-06 20:24:27 -07:00
ivaaan
c2f3ddd329 add RTVI to Hume 2025-10-06 19:41:31 -07:00
ivaaan
73ffe96228 add RTVI to Hume 2025-10-06 19:37:05 -07:00
Aleix Conchillo Flaqué
bd13a80da7 pyproject: update google dependencies 2025-10-06 17:38:08 -07:00
Aleix Conchillo Flaqué
312959f97e GoogleLLMService: update default model to gemini-2.5-flash 2025-10-06 17:38:08 -07:00
Aleix Conchillo Flaqué
fe168e3c68 GoogleLLMService: added support for image generation 2025-10-06 17:38:08 -07:00
Filipi Fuchter
28929a47f7 Verifying WhatsApp signature to ensure the webhook request is from WhatsApp. 2025-10-06 16:16:59 -03:00
Mark Backman
03f5defbc3 Merge pull request #2793 from pipecat-ai/mb/fix-flux-deprecation
Fix: Resolve Flux deprecation warning
2025-10-06 12:07:27 -04:00
Mark Backman
b216648315 Fix: Resolve Flux deprecation warning 2025-10-06 09:55:02 -04:00
Mark Backman
084b133a01 Merge pull request #2790 from pipecat-ai/add-security-md
Add SECURITY.md
2025-10-06 09:45:02 -04:00
Mark Backman
e589876176 Merge pull request #2786 from pipecat-ai/mb/nltk-download-error
Catch PermissionError when NLTK data can't be downloaded
2025-10-06 09:27:22 -04:00
Vanessa Pyne
a826313bf9 Add SECURITY.md 2025-10-05 13:24:47 -05:00
Mark Backman
49f44aa7c8 Catch PermissionError when NLTK data can't be downloaded 2025-10-04 08:41:32 -04:00
Mark Backman
64ceef9cf0 Merge pull request #2783 from pipecat-ai/mb/community-integrations-submission
Update to Community Integrations submission process
2025-10-03 12:41:13 -04:00
Mark Backman
cd6567c1f1 Update to Community Integrations submission process 2025-10-03 12:15:48 -04:00
Mark Backman
ac67ca1555 Merge pull request #2778 from pipecat-ai/mb/hume-cleanup
Tidying up the Hume example and service
2025-10-03 11:09:18 -04:00
mattie ruth backman
8d38994756 Transports now send InputTransportMessageFrames (not Urgent Frames) 2025-10-03 09:47:44 -04:00
LucasStringPay
607e3040d4 Ignore None 'completion_tokens' or similar
Similar to 144ea36c81, reported in https://github.com/pipecat-ai/pipecat/issues/2207
2025-10-02 15:16:11 -07:00
Mark Backman
60604a9449 Tidying up the Hume example and service 2025-10-02 17:34:40 -04:00
Aleix Conchillo Flaqué
4abe4a6253 Merge pull request #2777 from pipecat-ai/aleix/readme-mention-tail
README: add tail terminal dashboard
2025-10-02 14:31:26 -07:00
Aleix Conchillo Flaqué
4c054af17b README: remove setup editor instructions 2025-10-02 14:30:31 -07:00
Aleix Conchillo Flaqué
dcba940d42 README: add tail terminal dashboard 2025-10-02 14:27:55 -07:00
Mark Backman
ad2adb0c58 Merge pull request #2518 from zgreathouse/hume-tts-service
Hume tts service
2025-10-02 17:26:39 -04:00
ivaaan
76923010b5 upd Hume version to 2 2025-10-02 13:57:07 -07:00
ivaaan
1b511557b2 upd evals 2025-10-02 13:48:30 -07:00
ivaaan
fdadb12933 upd Changelog 2025-10-02 13:46:22 -07:00
ivaaan
f1bbb7ba22 Regenerate uv.lock after resolving merge conflicts 2025-10-02 13:44:07 -07:00
ivaaan
c1492c5275 fixes based on markbackman review 2025-10-02 13:38:36 -07:00
ivaaan
4ffdabcfde add Hume example, small fixes 2025-10-02 13:38:36 -07:00
zach
b489de2fc3 adds hume tts service 2025-10-02 13:38:05 -07:00
zach
d9656cbb1a add hume sdk for hume tts service 2025-10-02 13:38:05 -07:00
zach
05fb223985 Add hume to .env.example 2025-10-02 13:34:37 -07:00
Mark Backman
62a5f07ad2 Merge pull request #2701 from pipecat-ai/mb/third-party-integrations
Add a third-party integrations guide
2025-10-02 15:59:38 -04:00
Mark Backman
b669e3a481 Update name to Community Integrations and streamline guide 2025-10-02 15:54:04 -04:00
Mark Backman
99f1041a47 More review fixes 2025-10-02 14:48:12 -04:00
Mark Backman
37b1345bfa Changes from review feedback 2025-10-02 14:48:12 -04:00
Mark Backman
8994ac17eb Add a third-party integrations guide 2025-10-02 14:48:12 -04:00
Mark Backman
63bc825008 Merge pull request #2771 from pipecat-ai/mb/update-publish-workflows
Updates to publish workflows
2025-10-02 12:35:43 -04:00
Mark Backman
e7ffde1c4c Merge pull request #2774 from pipecat-ai/mb/docs-fixes-0.0.87
Fix: Resolve docstring build issues before 0.0.87 release
2025-10-02 12:34:27 -04:00
Mark Backman
1c88565725 Merge pull request #2772 from pipecat-ai/mb/fix-openai-realtime-import
Fix: Change import for OpenAIRealtimeLLMContext in OpenAIRealtimeLLMS…
2025-10-02 12:34:16 -04:00
Aleix Conchillo Flaqué
07a6c2fb0e Merge pull request #2775 from pipecat-ai/aleix/pipecat-0.0.87
update CHANGELOG for 0.0.87
2025-10-02 09:12:41 -07:00
Aleix Conchillo Flaqué
e99f3bf75a update CHANGELOG for 0.0.87 2025-10-02 09:11:30 -07:00
Mark Backman
f09d780413 Fix: Resolve docstring build issues before 0.0.87 release 2025-10-02 10:09:25 -04:00
Mark Backman
e370d23374 Fix: Change import for OpenAIRealtimeLLMContext in OpenAIRealtimeLLMService 2025-10-02 09:39:44 -04:00
Mark Backman
b68ec14146 Updates to publish workflows 2025-10-02 08:25:35 -04:00
Filipi da Silva Fuchter
c567fd71b1 Merge pull request #2747 from pipecat-ai/filipi/whatsapp_runner
Creating the whatsapp routes inside the runner.
2025-10-01 21:21:34 -03:00
Filipi da Silva Fuchter
2ca1b2d6f8 Merge pull request #2612 from pipecat-ai/filipi/deepgram_flux
Integrating the new Deepgram model (Flux) with Pipecat
2025-10-01 21:20:47 -03:00
Mark Backman
04041a9a9a Merge pull request #2757 from pipecat-ai/hush/retryTimeout
Fix AWS Bedrock timeout exception handling
2025-10-01 19:08:09 -04:00
Aleix Conchillo Flaqué
6c498dc70f Merge pull request #2745 from pipecat-ai/aleix/transport-message-frames-deprecations
transport message frames deprecations
2025-10-01 16:05:55 -07:00
James Hush
32b07c1720 Fix AWS Bedrock timeout exception handling
- Use ReadTimeoutError and asyncio.TimeoutError, which are the actual exceptions thrown by boto3
2025-10-01 19:04:35 -04:00
Aleix Conchillo Flaqué
ad507ce23d FrameLogger: it's fine to print transport messages 2025-10-01 16:00:42 -07:00
Aleix Conchillo Flaqué
be562cedfc DailyTransport: deprecate DailyTransportMessage(Urgent)Frame 2025-10-01 16:00:42 -07:00
Aleix Conchillo Flaqué
089e703e1f LiveKitTransport: deprecate LiveKitTransportMessage(Urgent)Frame 2025-10-01 16:00:42 -07:00
Aleix Conchillo Flaqué
4dc1e15a99 frames: use OutputTransportMessage(Urgent)Frame instead of TransportMessage(Urgent)Frame 2025-10-01 16:00:42 -07:00
Aleix Conchillo Flaqué
c7dc2e886f frames: use InputTransportMessageFrame instead of InputTransportMessageUrgentFrame
By default, input frames are already urgent.
2025-10-01 15:30:45 -07:00
Filipi Fuchter
11bc4ea854 Adding deepgram flux to release evals. 2025-10-01 19:24:58 -03:00
Mark Backman
029d76033d Merge pull request #2765 from pipecat-ai/mb/remove-daily-logging-04a
Remove DailyLogLevel from 04a example
2025-10-01 17:52:33 -04:00
Aleix Conchillo Flaqué
924d7dea9a Merge pull request #2766 from pipecat-ai/aleix/rtvi-properly-deprecate-errors-enabled
RTVIParams: properly deprecate errors_enabled
2025-10-01 14:49:12 -07:00
Aleix Conchillo Flaqué
244e94f3ce RTVIParams: properly deprecate errors_enabled 2025-10-01 14:30:41 -07:00
Mark Backman
af1f51d49e Remove DailyLogLevel from 04a example 2025-10-01 17:06:35 -04:00
Filipi da Silva Fuchter
9ba3c168b8 Merge pull request #2756 from pipecat-ai/filipi/esp32
SDP munging fixes.
2025-10-01 16:05:47 -03:00
Filipi Fuchter
e6ee8f7a16 New example using DeepgramFluxSTTService. 2025-10-01 15:43:25 -03:00
Filipi Fuchter
2ea2bd99e0 Deepgram Flux speech-to-text service implementation. 2025-10-01 15:43:09 -03:00
Filipi Fuchter
0c2ced7c52 Created WebsocketSTTService base class. 2025-10-01 15:42:56 -03:00
Filipi Fuchter
fb160646b8 Fixing the SDP munging to keep it working on Chrome. 2025-10-01 14:18:39 -03:00
Filipi da Silva Fuchter
89fed57af2 Merge pull request #2748 from pipecat-ai/filipi/remove_smallwebrtc_queue
Removing the message queue inside the SmallWebRTCConnection.
2025-10-01 08:07:47 -03:00
Aleix Conchillo Flaqué
feae3b6d2d Merge pull request #2742 from pipecat-ai/aleix/deprecate-daily-update-remote-participants-frame
DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame
2025-09-30 16:27:34 -07:00
Aleix Conchillo Flaqué
92d3be8975 DailyTransport: deprecated DailyUpdateRemoteParticipantsFrame 2025-09-30 16:26:48 -07:00
Aleix Conchillo Flaqué
0f53e1db2c Merge pull request #2759 from pipecat-ai/aleix/dont-cancel-if-finished
PipelineTask: avoid cancellation if application is finished
2025-09-30 16:21:16 -07:00
Aleix Conchillo Flaqué
d398e8cc10 Merge pull request #2761 from pipecat-ai/aleix/rtvi-tail-updates
RTVI updates: audio levels and system logs
2025-09-30 13:55:17 -07:00
Aleix Conchillo Flaqué
e5f263d380 update CHANGELOG 2025-09-30 13:51:35 -07:00
Aleix Conchillo Flaqué
3a4c303c54 RTVIParams: add errors_enabled deprecation warnings 2025-09-30 13:49:51 -07:00
Mark Backman
54a1ef47d0 Merge pull request #2758 from pipecat-ai/mb/claude-sonnet-4.5
Update AnthropicLLMService to use claude-sonnet-4-5-20250929
2025-09-30 16:42:47 -04:00
Aleix Conchillo Flaqué
149ffa4f3c RTVIObserver: add support for system logs 2025-09-30 13:42:40 -07:00
Aleix Conchillo Flaqué
e5465034d9 RTVIObserver: add support for user/bot audio levels 2025-09-30 13:41:26 -07:00
Aleix Conchillo Flaqué
568c7c782d rtvi: allow None RTVIProcessor and rename to send_rtvi_message() 2025-09-30 13:35:27 -07:00
Aleix Conchillo Flaqué
9851334221 rtvi: deprecate errors_enabled and always send errors 2025-09-30 13:31:30 -07:00
Aleix Conchillo Flaqué
e79c4fc99d PipelineTask: avoid cancellation if application is finished 2025-09-30 13:18:25 -07:00
Aleix Conchillo Flaqué
55c321f4ff Merge pull request #2751 from pipecat-ai/aleix/nova-sonic-disconnect-fix
AWSNovaSonicLLMService: add missing await
2025-09-30 13:12:22 -07:00
kompfner
a14a53a005 Merge pull request #2735 from pipecat-ai/pk/remove-openaillmcontext-usage
Remove remaining usage of `OpenAILLMContext` throughout the codebase …
2025-09-30 10:09:25 -04:00
Mark Backman
a71f937e8f Update AnthropicLLMService to use claude-sonnet-4-5-20250929 2025-09-30 08:49:30 -04:00
Filipi Fuchter
032032df65 Only remove ESP32 ICE candidates if host is defined. 2025-09-29 15:42:23 -03:00
Mark Backman
d0178edad0 Merge pull request #2753 from pipecat-ai/mb/quickstart-0.0.86
Quickstart: Update to 0.0.86, removing pytorch requirements
2025-09-29 09:43:33 -04:00
Mark Backman
795c5e55d9 Quickstart: Update to 0.0.86, removing pytorch requirements 2025-09-27 08:30:37 -04:00
Aleix Conchillo Flaqué
8f8d8ae0d8 AWSNovaSonicLLMService: add missing await 2025-09-26 15:58:05 -07:00
Vanessa Pyne
741f192d04 Merge pull request #2096 from pipecat-ai/vp-mcp-ex-nit
mcp examples: check for env vars needed for examples
2025-09-26 10:21:22 -05:00
Filipi Fuchter
a5595b82ea removing the message queue inside the SmallWebRTCConnection. 2025-09-26 11:02:17 -03:00
Filipi Fuchter
4d1915eb41 Fixing ruff format. 2025-09-26 10:49:52 -03:00
Filipi Fuchter
b3a84fc772 Refactoring how we are handling the lifespan inside the runner. 2025-09-26 10:47:04 -03:00
Filipi Fuchter
403d22e62c Creating the whatsapp routes inside the runner. 2025-09-26 10:28:19 -03:00
Aleix Conchillo Flaqué
ee00ee5c57 Merge pull request #2744 from pipecat-ai/aleix/vad-analyzer-thread-executor
BaseInputTransport: create VAD thread in VADAnalyzer
2025-09-25 13:43:34 -07:00
Aleix Conchillo Flaqué
f53fd880dc BaseInputTransport: create VAD thread in VADAnalyzer
We move the thread creation to the VADAnalyzer instead of the input
transport. This can potentially be useful if we need to analyze multiple audio
streams.
2025-09-25 13:41:20 -07:00
Aleix Conchillo Flaqué
de3461e4cc Merge pull request #2743 from pipecat-ai/aleix/turn-analyzer-fixes
turn analyzer fixes
2025-09-25 13:40:43 -07:00
Aleix Conchillo Flaqué
7bafc3a1bb BaseSmartTurn: process speech in a separate thread 2025-09-25 13:37:28 -07:00
Aleix Conchillo Flaqué
22ef61fe8d BaseTurnAnalyzer: add BaseTurnParams base class for parameters 2025-09-25 13:37:09 -07:00
Aleix Conchillo Flaqué
7078fb53bd Merge pull request #2738 from pipecat-ai/aleix/openai-cached-tokens-metrics
BaseOpenAILLMService: include cached tokens to metrics frame
2025-09-25 13:36:03 -07:00
Aleix Conchillo Flaqué
33447ad6f2 BaseOpenAILLMService: include cached tokens to metrics frame 2025-09-24 19:32:16 -07:00
Paul Kompfner
6faa50ae5b Remove remaining usage of OpenAILLMContext throughout the codebase in favor of LLMContext, except for:
- Usage in classes that are already deprecated
- Usage related to realtime LLMs, which don't yet support `LLMContext`
- Usage in (soon-to-be-deprecated) code paths related to `OpenAILLMContext` itself and associated machinery
2025-09-24 16:35:03 -04:00
Aleix Conchillo Flaqué
3797f41c8c Merge pull request #2734 from pipecat-ai/aleix/pipecat-0.0.86
update CHANGELOG for 0.0.86
2025-09-24 12:19:16 -07:00
Aleix Conchillo Flaqué
ff919b8c15 update CHANGELOG for 0.0.86 2025-09-24 11:28:14 -07:00
Aleix Conchillo Flaqué
cb048d6c7e Merge pull request #2733 from pipecat-ai/aleix/tavus-missing-daily-callback
TavusTransport: add missing on_before_leave callback
2025-09-24 11:10:32 -07:00
Aleix Conchillo Flaqué
6c2c43ade0 Merge pull request #2724 from pipecat-ai/pk/update-natural-conversation-examples-with-universal-context
Update natural conversation examples with universal context
2025-09-24 11:07:50 -07:00
Aleix Conchillo Flaqué
f899c15b03 Merge pull request #2731 from pipecat-ai/pk/update-example-25-to-use-universal-context
Update example 25 to use universal `LLMContext`
2025-09-24 11:05:36 -07:00
Aleix Conchillo Flaqué
d10ef08775 Merge pull request #2727 from pipecat-ai/pk/strands-agents-needs-to-support-openaillmcontextframe-for-now
`StrandsAgentsProcessor` should still support `OpenAILLMContextFrame`…
2025-09-24 11:05:07 -07:00
Aleix Conchillo Flaqué
27a5af6fa1 Merge pull request #2728 from pipecat-ai/pk/fix-playht-in-env-example
Fix PlayHT env variable names in env.example
2025-09-24 11:04:55 -07:00
Aleix Conchillo Flaqué
4bff0a7c49 Merge pull request #2732 from pipecat-ai/pk/update-voicemail-detector-to-use-llm-context
Update `VoicemailDetector` to use universal `LLMContext`
2025-09-24 11:04:42 -07:00
Aleix Conchillo Flaqué
508f7d203d Merge pull request #2729 from pipecat-ai/aleix/frame-processor-cancel-default-timeout
FrameProcessor: timeout when cancelling tasks
2025-09-24 10:55:52 -07:00
Filipi da Silva Fuchter
0f87d5342c Merge pull request #2266 from pipecat-ai/filipi/hey_gen_transport
HeyGen implementation for Pipecat - HeyGenTransport
2025-09-24 14:35:50 -03:00
Filipi Fuchter
f6164e3bde HeyGen implementation for Pipecat - HeyGenTransport. 2025-09-24 14:33:56 -03:00
Aleix Conchillo Flaqué
1a0fb55d0f TavusTransport: add missing on_before_leave callback 2025-09-24 10:18:56 -07:00
Paul Kompfner
6d0beef944 Update VoicemailDetector to use universal LLMContext 2025-09-24 12:58:42 -04:00
Paul Kompfner
b9fd6b873b Update comment in example 07b to reference LLMContext rather than OpenAILLMContext 2025-09-24 12:49:34 -04:00
Paul Kompfner
dea0f1791f Update OpenAILLMAdapter.get_messages_for_logging() to truncate "input_audio" message data 2025-09-24 12:41:41 -04:00
Paul Kompfner
da66c38795 Update example 25 to use universal LLMContext 2025-09-24 12:37:29 -04:00
Paul Kompfner
912f8b96f0 Fix PlayHT env variable names in env.example 2025-09-24 11:24:35 -04:00
Aleix Conchillo Flaqué
f9eb447d82 FrameProcessor: timeout when cancelling tasks 2025-09-24 08:24:28 -07:00
Paul Kompfner
65f5fe8588 StrandsAgentsProcessor should still support OpenAILLMContextFrame until that frame has been deprecated 2025-09-24 11:05:27 -04:00
Paul Kompfner
817c77f3fe Update SmallWebRTCTransport to pass a sender ID in the "on_app_message" event 2025-09-24 10:24:59 -04:00
Paul Kompfner
8896179b00 Update another "natural conversation" example to use universal LLMContext. Note that this one had to also be fixed in various ways, as it wasn't working. 2025-09-24 09:55:11 -04:00
Filipi da Silva Fuchter
463752360b Merge pull request #2726 from pipecat-ai/mb/twilio-serializer-cleanup
Fixup for TwilioFrameSerializer
2025-09-24 10:45:38 -03:00
Paul Kompfner
66b7977a62 Make SmallWebRTCTransport adhere to the expected "on_app_message" event signature 2025-09-24 09:34:28 -04:00
Mark Backman
468de68aec Fixup for TwilioFrameSerializer 2025-09-24 09:32:46 -04:00
Mark Backman
c4762c1a92 Merge pull request #2627 from jessieweiyi/main
Support TwilioFrameSerializer region/edge settings. Close #2625.
2025-09-24 09:13:10 -04:00
Aleix Conchillo Flaqué
7f4d3a2f02 pyproject: updated sentry to 2.38.0 2025-09-23 19:12:03 -07:00
Jessie Wei
88614b312f Merge branch 'pipecat-ai:main' into main 2025-09-24 10:23:52 +10:00
Jessie Wei
5b4655f45a chore: Update per review comments 2025-09-24 00:22:56 +00:00
Aleix Conchillo Flaqué
d7c8f8df53 update CHANGELOG with AudioBufferProcessor fixes 2025-09-23 15:42:01 -07:00
Aleix Conchillo Flaqué
2571cb2e69 tests: fix formatting 2025-09-23 15:28:07 -07:00
Aleix Conchillo Flaqué
15782be27c Merge pull request #2676 from golbin/main
Fix audio buffer flush and silence handling
2025-09-23 15:27:31 -07:00
Aleix Conchillo Flaqué
997e4b66c6 Merge pull request #2722 from pipecat-ai/aleix/strands-agents-update-and-evals
examples: update Strands Agents with universal context and add evals
2025-09-23 15:21:41 -07:00
Paul Kompfner
6ccbfd9b57 Update "natural conversation" examples to use universal LLMContext 2025-09-23 16:20:16 -04:00
Paul Kompfner
677f69971c Add filters in 22b and 22c examples to prevent function call results from triggering the "statement judge" LLM unnecessarily, and with the wrong system prompt, which would result in garbled output statements that combine both LLMs' outputs 2025-09-23 16:14:58 -04:00
Paul Kompfner
678dd22b8e Add missing sender argument to a few "on_app_message" handlers in examples 2025-09-23 15:29:20 -04:00
kompfner
0bba02028d Merge pull request #2721 from pipecat-ai/pk/update-persistent-storage-examples-to-use-universal-llmcontext
Update persistent conversation storage examples to use universal `LLM…
2025-09-23 15:18:21 -04:00
Aleix Conchillo Flaqué
620b1f785c examples: update Strands Agents with universal context and add evals 2025-09-23 11:37:57 -07:00
Paul Kompfner
780e91eb91 Update persistent conversation storage examples to use universal LLMContext.
Note that `LLMContext` doesn't have a `get_messages_for_persistent_storage()`; the messages are already in the "standard" format so they can be used directly for storage.
2025-09-23 14:16:35 -04:00
Aleix Conchillo Flaqué
667569ef47 Merge pull request #2708 from pipecat-ai/aleix/base-output-transport-only-push-if-send-successful
BaseOutputTransport: only push downstream if transport write successful
2025-09-23 11:10:29 -07:00
Aleix Conchillo Flaqué
17ea0afa6f StrandsAgentsProcessor: more formatting fixes 2025-09-23 11:05:14 -07:00
Aleix Conchillo Flaqué
3fc5214c15 BaseOutputTransport: only push downstream if transport write successful
Fixes #2589
2025-09-23 11:04:37 -07:00
kompfner
1636c48ab9 Merge pull request #2720 from pipecat-ai/pk/update-changelog-with-additional-llmcontext-support
Update CHANGELOG with additional `LLMContext` support
2025-09-23 13:46:20 -04:00
Paul Kompfner
c3a2fa100c Update CHANGELOG with additional LLMContext support 2025-09-23 13:45:49 -04:00
kompfner
8649368337 Merge pull request #2719 from pipecat-ai/pk/update-more-examples-to-use-universal-llmcontext
Update more examples to use universal `LLMContext`. Specifically, upd…
2025-09-23 13:44:20 -04:00
Aleix Conchillo Flaqué
781366627c updated CHANGELOG with Strands Agents 2025-09-23 10:41:35 -07:00
Aleix Conchillo Flaqué
f6b4db42ef pyproject: add strands-agents 2025-09-23 10:39:06 -07:00
Aleix Conchillo Flaqué
ed64716219 StrandsAgentsProcessor: fix formatting 2025-09-23 10:38:31 -07:00
Aleix Conchillo Flaqué
5c22b2e1de Merge pull request #2610 from adithyaxx/add-strands-processor
Add native Strands Agents support to Pipecat
2025-09-23 10:33:22 -07:00
Paul Kompfner
d4b1e1ab41 Update more examples to use universal LLMContext. Specifically, update examples we didn't update before because they weren't using ToolsSchema for their tool definitions, which is a requirement for using LLMContext.
NOTE: oops! Turns out some of these files had *already* been updated to use universal `LLMContext` even though they weren't yet using `ToolsSchema`. This commit should fix those examples.
2025-09-23 12:41:35 -04:00
Mark Backman
fafe0cc4a3 Merge pull request #2718 from lshaun/fix-openai-deprecation-warning
update imports to avoid deprecated module
2025-09-23 12:29:16 -04:00
Mark Backman
40c82a8530 Merge pull request #2716 from pipecat-ai/mb/add-11labs-stt
Add ElevenLabsSTTService
2025-09-23 12:21:08 -04:00
lshaun
98d3686861 update imports to avoid deprecated module 2025-09-23 15:58:09 +00:00
kompfner
88337fc21f Merge pull request #2717 from pipecat-ai/pk/mem0-support-univeral-context
Add support for universal `LLMContext` to `Mem0MemoryService`
2025-09-23 11:54:55 -04:00
kompfner
928c0ef1b4 Merge pull request #2715 from pipecat-ai/pk/langchain-processor-support-univeral-context
Add support for universal `LLMContext` to `LangchainProcessor`
2025-09-23 11:54:44 -04:00
kompfner
1f005e7075 Merge pull request #2714 from pipecat-ai/pk/gated-openai-llm-context-aggregator-support-univeral-context
Add support for universal `LLMContext` to `GatedOpenAILLMContextAggre…
2025-09-23 11:54:26 -04:00
kompfner
2cee6229ae Merge pull request #2713 from pipecat-ai/pk/log-observer-support-univeral-context
Add support for universal `LLMContext` to `LLMLogObserver`
2025-09-23 11:54:11 -04:00
Aleix Conchillo Flaqué
10e9371f49 Merge pull request #2712 from pipecat-ai/aleix/pipeline-runner-signals-windows
PipelineRunner: use signal.signal() on Windows
2025-09-23 08:28:26 -07:00
Paul Kompfner
e21ab89509 Add support for universal LLMContext to Mem0MemoryService 2025-09-23 10:35:54 -04:00
Mark Backman
cbce2075eb Add ElevenLabsSTTService 2025-09-23 10:27:31 -04:00
Paul Kompfner
97868175e6 Add support for universal LLMContext to LangchainProcessor 2025-09-23 10:00:48 -04:00
Paul Kompfner
99731ca40a Add support for universal LLMContext to GatedOpenAILLMContextAggregator, renaming it to GatedLLMContextAggregator in the process 2025-09-23 09:52:13 -04:00
Paul Kompfner
f96cbcce22 Add support for universal LLMContext to LLMLogObserver 2025-09-23 09:30:14 -04:00
kompfner
a19b9f70c0 Merge pull request #2706 from pipecat-ai/pk/update-examples-to-use-universal-llm-context
Update examples, wherever possible, to use `LLMContext` and associate…
2025-09-23 09:20:39 -04:00
Filipi da Silva Fuchter
9b4f1bdf39 Merge pull request #2705 from tzookb/tzookb/latency-logging
update UserBotLatencyLogObserver to have logging in functions that can be overridden
2025-09-23 09:46:20 -03:00
Filipi Fuchter
6b2bf8de64 Fixing the ruff format and making the methods sync. 2025-09-23 09:41:50 -03:00
Filipi da Silva Fuchter
33481c6614 Merge pull request #2672 from pipecat-ai/filipi/pcc_small_webrtc_2
Monitoring the peer connection while it is in the *connecting* state.
2025-09-23 08:32:16 -03:00
Filipi Fuchter
a5776b20ad Monitoring the peer connection while it is in the *connecting* state. 2025-09-23 08:30:17 -03:00
Filipi da Silva Fuchter
e286e015cf Merge pull request #2687 from pipecat-ai/memory_leak
Improving memory cleanup
2025-09-23 08:13:05 -03:00
Filipi Fuchter
a7bfac8d68 Mentioning the memory cleanups in the changelog. 2025-09-23 08:09:31 -03:00
Filipi Fuchter
1647b5b665 Created a new example using the video processor to make it easier to investigate memory leaks. 2025-09-23 08:05:28 -03:00
Filipi Fuchter
eaecefe675 Refactoring how we are reading the image bytes inside the base llm. 2025-09-23 08:05:15 -03:00
Filipi Fuchter
7c569b3863 Calling task_done when reading the audio from the queue. 2025-09-23 08:04:06 -03:00
Filipi Fuchter
8bf6a4c66f Improving memory cleanup inside the SmallWebRTCTransport. 2025-09-23 08:03:55 -03:00
Filipi Fuchter
1df3660186 Not storing anymore the last frames received to display them in the idle processor. 2025-09-23 08:03:35 -03:00
Aleix Conchillo Flaqué
75c0b089e0 PipelineRunner: use signal.signal() on Windows 2025-09-22 22:38:12 -07:00
Aleix Conchillo Flaqué
d8f3d4dd32 Merge pull request #1874 from nischalj10/patch-1
added deepwiki badge for weekly repo refresh
2025-09-22 17:44:08 -07:00
Aleix Conchillo Flaqué
c5e53bb84f Merge pull request #2707 from pipecat-ai/aleix/daily-transport-deprecated-mistake
DailyTransport: remove deprecated note and double registration
2025-09-22 17:01:09 -07:00
Aleix Conchillo Flaqué
b04e494373 DailyTransport: remove deprecated note and double registration 2025-09-22 16:45:58 -07:00
Jessie Wei
392293d55f Merge branch 'pipecat-ai:main' into main 2025-09-23 07:48:45 +10:00
Paul Kompfner
272532a3ea Update examples, wherever possible, to use LLMContext and associated machinery instead of OpenAILLMContext and associated machinery.
With all these examples updated, we no longer need dedicated examples illustrating `LLMContext`, so they're removed.

Here’s where we *don’t* yet use `LLMContext` and associated machinery:
- Realtime services: OpenAI Realtime, Gemini Live, and AWS Nova Sonic (support coming soon)
- `GoogleLLMOpenAIBetaService` (it’s deprecated, so we didn’t bother adding support)
- `LLMLogObserver` (support coming soon)
- `GatedOpenAILLMContextAggregator` (support coming soon)
- `LangchainProcessor` (support coming soon)
- `Mem0MemoryService` (support coming soon)
- Examples that use LLM-specific tools definitions as opposed to `ToolsSchema` (these will be updated soon)
- Examples that rely on `GoogleLLMContext.upgrade_to_google` (TBD what to do with these)

Examples that use `LLMLogObserver`:
- 30-

Examples that use `GatedOpenAILLMContextAggregator`:
- 22-

Examples that use `LangchainProcessor`:
- 07b-

Examples that use `Mem0MemoryService`:
- 37-

Examples that need updating to use `ToolsSchema`:
- 15-
- 15a-
- 20a-
- 20c-
- 20d-
- 22b-
- 22c-
- 33-
- 36-

Examples that use `GoogleLLMContext.upgrade_to_google`:
- 22d-
- 25-
2025-09-22 16:21:35 -04:00
Tzook Bar Noy
3d04f565ec update latency observer logging to be in unique funcs, so others could extend and overwrite 2025-09-22 15:38:29 -04:00
Filipi da Silva Fuchter
d0477edb6a Merge pull request #2696 from pipecat-ai/filipi/inworld_default_temperature
Changing InworldTTSService default temperature to 1.1
2025-09-22 09:33:52 -03:00
Filipi Fuchter
326bfe4239 Removing the temperature from InworldTTSService example. 2025-09-22 09:30:53 -03:00
Aleix Conchillo Flaqué
3cb78d839d examples(foundational): update comment in 45-before-and-afet-events.py 2025-09-20 10:33:52 -07:00
Aleix Conchillo Flaqué
9129e44c05 Merge pull request #2697 from pipecat-ai/aleix/frame-processor-before-after-events
FrameProcessor: add before/after events for processed/pushed frames
2025-09-20 10:26:37 -07:00
Aleix Conchillo Flaqué
ec664e2d33 examples(foundational): added 45-before-and-afet-events.py 2025-09-20 10:23:49 -07:00
Aleix Conchillo Flaqué
3d88b42e0b FrameProcessor: add before/after events for processed/pushed frames 2025-09-19 20:47:21 -07:00
Aleix Conchillo Flaqué
2289409b4c Merge pull request #2699 from pipecat-ai/aleix/daily-on-before-leave
DailyTransport: rename on_before_leave to on_before_disconnect
2025-09-19 20:38:46 -07:00
Aleix Conchillo Flaqué
b1551b0d6b DailyTransport: rename on_before_leave to on_before_disconnect 2025-09-19 19:21:31 -07:00
Mark Backman
7cb5c951f4 Merge pull request #2689 from pipecat-ai/mb/foundational-smart-turn
Update quickstart and foundational examples to use smart-turn v3
2025-09-19 15:39:14 -07:00
mattie ruth backman
a2cb5ab8e1 Add Changelog entry 2025-09-19 17:46:29 -04:00
mattie ruth backman
cad3104d56 Fix _handle_send_text to use default options if not provided 2025-09-19 17:46:29 -04:00
mattie ruth backman
d8a2a917a2 Deprecate append-to-text handling in favor of new send-text with support for toggling skip-tts while handling the text 2025-09-19 17:46:29 -04:00
chadbailey59
3984cb58a2 cleared input and process events when pausing (#2698) 2025-09-19 16:44:43 -05:00
chadbailey59
9027a96a07 Add remote participant updates to DailyTransport (#2694)
* add remote participant updates to DailyTransport

* cleanup

* cleanup

* ruff cleanup again
2025-09-19 16:03:34 -05:00
Aleix Conchillo Flaqué
6abac3e3e5 Merge pull request #2673 from pipecat-ai/aleix/sync-event-handlers
introduce synchronous event handlers
2025-09-19 13:40:43 -07:00
Aleix Conchillo Flaqué
077b949bb2 BaseObject: run each handler for the same event in a separate task 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
3a6c9786e8 LiveKitTransport: added synchronous before_disconnect event 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
492da16cc9 DailyTransport: added synchronous on_before_disconnect event 2025-09-19 13:32:39 -07:00
Aleix Conchillo Flaqué
a698c4064b BaseObject: allow synchronous event handlers 2025-09-19 13:32:39 -07:00
Filipi Fuchter
9e098b5f79 Changing InworldTTSService default temperature to 1.1 2025-09-19 17:03:53 -03:00
Mark Backman
7df7395dd1 Merge pull request #2692 from pipecat-ai/mb/lazy-load-smallwebrtc-request
Lazy load SmallWebRTCRequest, SmallWebRTCRequestHandler in runner
2025-09-19 10:43:43 -07:00
Mark Backman
0885bc9cdf Lazy load SmallWebRTCRequest, SmallWebRTCRequestHandler in runner 2025-09-19 13:28:01 -04:00
vipyne
889dc19a27 mcp examples: check for env vars needed for examples 2025-09-19 12:09:50 -05:00
Mark Backman
4bc41466b7 Bump quickstart pipecat-ai version to require smart-turn v3, add local smart-turn dep, update quickstart 2025-09-19 08:03:06 -04:00
Mark Backman
9ab8ddee79 Update quickstart and foundational examples to use smart-turn v3 2025-09-18 23:54:18 -04:00
Aleix Conchillo Flaqué
0204f6a95d Merge pull request #2686 from pipecat-ai/aleix/silero-vad-v6
audio(vad): update Silero VAD model to v6
2025-09-18 20:31:10 -07:00
Mark Backman
b0bf653f04 Merge pull request #2679 from pipecat-ai/mb/gladia-remove-confidence
GladiaSTTService: deprecate confidence arg
2025-09-18 17:41:33 -07:00
Mark Backman
e8a676eb36 GladiaSTTService: deprecate confidence arg 2025-09-18 20:38:53 -04:00
Mark Backman
ca96eef1f3 Merge pull request #2680 from pipecat-ai/mb/dial-in-session-id
DailyTransport sip_call_transfer now automatically receives session_id
2025-09-18 17:36:51 -07:00
Mark Backman
8e1637d6c7 DailyTransport sip_call_transfer now automatically receives session_id 2025-09-18 20:34:14 -04:00
Filipi da Silva Fuchter
367200c0ad Merge pull request #2682 from pipecat-ai/filipi/smallwebrtc_leak
Smallwebrtc memory leak
2025-09-18 18:56:08 -03:00
Filipi Fuchter
766e1948a6 Mentioning the fix in the changelog. 2025-09-18 18:43:33 -03:00
Aleix Conchillo Flaqué
f369683b8b audio(vad): update Silero VAD model to v6 2025-09-18 14:06:37 -07:00
Aleix Conchillo Flaqué
461025d1cc Merge pull request #2684 from pipecat-ai/aleix/readme-whisker
README: add whisker debugger
2025-09-18 13:27:35 -07:00
Aleix Conchillo Flaqué
ac88706f38 README: add whisker debugger 2025-09-18 13:22:54 -07:00
Filipi Fuchter
93a89449b8 Adding warnings in case queue grows. 2025-09-18 16:43:57 -03:00
Filipi Fuchter
199bf72945 Preventing memory growth if we are not consuming the track. 2025-09-18 16:16:10 -03:00
Filipi Fuchter
d20e4125f6 Updating aiortc to the latest version. 2025-09-18 15:22:46 -03:00
Filipi Fuchter
c1baed642e Script to monitor memory usage. 2025-09-18 14:43:42 -03:00
Mark Backman
33ef68573f Merge pull request #2662 from pelguetat/fix-vertex-ai-global-location-support
feat: add support for global location in Vertex AI base URL
2025-09-18 10:25:10 -07:00
Pablo Elgueta
3c1b41df13 docs: add changelog entry for global location support
- Document the new global location support in GoogleVertexLLMService
- Explain the difference between regional and global API hosts
- Follow Keep a Changelog format
2025-09-18 17:39:03 +01:00
Jin Kim
58f70e7e0d Add tests for audio buffer processor flush alignment 2025-09-18 22:19:32 +09:00
kompfner
fca4ecc73c Merge pull request #2675 from pipecat-ai/pk/service-switcher-logic-simplification
Simplify a bit of logic in `ServiceSwitcher`
2025-09-18 09:17:22 -04:00
Jin Kim
d0b573e44f Fix audio buffer flush and silence handling 2025-09-18 19:40:45 +09:00
Paul Kompfner
cfa333508b Simplify a bit of logic in ServiceSwitcher 2025-09-17 21:03:38 -04:00
Mark Backman
9e7260393a Merge pull request #2671 from pipecat-ai/mb/fix-asyncai-ttstextframe
fix: AsyncAITTSService wasn't pushing TTSTextFrames
2025-09-17 14:06:41 -07:00
Mark Backman
073b585c52 fix: AsyncAITTSService wasn't pushing TTSTextFrames 2025-09-17 16:54:18 -04:00
Aleix Conchillo Flaqué
81c2e51bec Merge pull request #2669 from pipecat-ai/aleix/interruption-task-frame-wait-fixes
interruption task frame wait fixes
2025-09-17 13:47:57 -07:00
Aleix Conchillo Flaqué
42344125b1 tests: add unit tests for push_interruption_task_frame_and_wait() 2025-09-17 13:38:22 -07:00
Aleix Conchillo Flaqué
db5bcfaa51 FrameProcessor: fix push_interruption_task_frame_and_wait() 2025-09-17 13:38:21 -07:00
kompfner
615239b7d2 Merge pull request #2646 from pipecat-ai/pk/service-switcher-unit-tests
`ServiceSwitcher` unit tests (ended up being much more than that)
2025-09-17 16:30:18 -04:00
Paul Kompfner
27f1e9dd69 Update CHANGELOG with a description of the recently-fixed ServiceSwitcher bugs 2025-09-17 16:27:12 -04:00
Paul Kompfner
bd760deff2 Update comment with more detail for posterity 2025-09-17 16:19:31 -04:00
Paul Kompfner
8bc3c89140 Fix a bug preventing usage of multiple ServiceSwitchers in a pipeline 2025-09-17 16:09:18 -04:00
Paul Kompfner
2cd2567a37 Add a unit test validating that multiple ServiceSwitchers can be used in the same pipeline (currently failing) 2025-09-17 16:04:30 -04:00
Paul Kompfner
5b55988846 Denote a couple of variables are private with a leading underscore 2025-09-17 15:38:28 -04:00
Paul Kompfner
a12392182c Simplify, undoing the change allowing controlling ServiceSwitcher with immediate frames (SystemFrames). Service switcher frames are ControlFrames, which are easier to reason about. We can always build the immediate option later if needed (i.e. if there's sufficient user pull for it) 2025-09-17 15:35:02 -04:00
Paul Kompfner
b814b70e1e Allow controlling ServiceSwitcher with either immediate frames (SystemFrames) or in-order frames (ControlFrames).
Immediate is the "default", i.e. has the more obvious name (e.g. `ManuallySwitchServiceFrame` v `ManuallySwitchServiceControlFrame`), since that's *probably* what users will want to reach for. Also, the immediate frames are more likely to behave like what we had before the last few commits, where the service switch would always "jump the queue" by having an immediate effect once it hit the `ServiceSwitcher` in the pipeline, jumping ahead of frames in front of it destined for the service.
2025-09-17 15:35:02 -04:00
Paul Kompfner
a1f84e1b50 Remove extraneous unit tests 2025-09-17 15:35:02 -04:00
Paul Kompfner
0839b48da8 Fix an issue where the upstream ServiceSwitcherFilter wouldn't get updated with the current active service 2025-09-17 15:35:02 -04:00
Paul Kompfner
de51637b77 Update ServiceSwitcher so that ServiceSwitcherFrames (which might update the currently active service) are processed and have an effect at the expected time. We should be able to, for example, queue:
- A text frame
- A `ManuallySwitchServiceFrame` (which is a `ServiceSwitcherFrame`)
- Another text frame

And expect that the first text frame be handled by the initially active service and the second text frame be handled by the newly active one.

Previously, the `ManuallySwitchServiceFrame` would have an effect too early, causing both text frames to be handled by the newly active service. Why? Because the frame filtering condition was being updated *directly* by the `ServiceSwitcher`, which is upstream from the services it's switching between. It could therefore update the filters *before* the services received the prior frames.
2025-09-17 15:35:02 -04:00
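The ordering described in the commit above can be illustrated with a toy model. This is only a sketch of the in-order semantics, not Pipecat's `ServiceSwitcher` implementation: `Text`, `SwitchTo`, and `route_in_order` are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Text:
    """Hypothetical stand-in for a text frame."""
    text: str


@dataclass
class SwitchTo:
    """Hypothetical stand-in for a frame that switches the active service."""
    service: str


def route_in_order(
    frames: List[Union[Text, SwitchTo]], initial_service: str
) -> List[Tuple[str, str]]:
    """Assign each text frame to whichever service is active when it is reached.

    The switch takes effect at its position in the queue, so frames queued
    before it still go to the previously active service.
    """
    active = initial_service
    handled = []
    for frame in frames:
        if isinstance(frame, SwitchTo):
            active = frame.service
        else:
            handled.append((active, frame.text))
    return handled
```

With the queue `[Text("first"), SwitchTo("B"), Text("second")]` and initial service `"A"`, the first text frame is handled by `"A"` and the second by `"B"`, which is the behavior the commit says is expected.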
Paul Kompfner
e1b1dc16ec Add unit tests for ServiceSwitcher 2025-09-17 15:35:02 -04:00
Mark Backman
1fe27eb0a2 Merge pull request #2660 from pipecat-ai/mb/fix-user-idle-processor-cancel-task
fix: clean up how UserIdleProcessor handles return False
2025-09-16 14:48:59 -07:00
Mark Backman
d7e1389497 fix: clean up how UserIdleProcessor handles return False 2025-09-16 17:44:06 -04:00
Aleix Conchillo Flaqué
8c7230aa8f Merge pull request #2668 from pipecat-ai/aleix/livekit-update
livekit package update
2025-09-16 14:43:18 -07:00
Aleix Conchillo Flaqué
2cf71239b0 examples(01b): use TTSSpeakFrame instead of TextFrame 2025-09-16 17:18:45 -04:00
Aleix Conchillo Flaqué
ec2c62e32b pyproject: update to livekit 1.0.13
Fixes #2643
2025-09-16 17:18:44 -04:00
Mark Backman
38ce85e9a0 Merge pull request #2667 from zytegalaxy/mcp-serverparameters-typefix
fix: replace `Tuple` type with `TypeAlias` for server params in MCP client
2025-09-16 14:14:59 -07:00
Mark Backman
2279e5a899 Merge pull request #2663 from pipecat-ai/mb/websockets-15
Add support for websockets 15.0
2025-09-16 14:08:36 -07:00
Mark Backman
cce6eb5d87 Merge pull request #2666 from pipecat-ai/mb/update-38b-local-turn-model
38b: Update bundled ONNX smart-turn model
2025-09-16 14:05:12 -07:00
mehrdad
c2b98ae557 fix(lint): fix space format issue 2025-09-16 13:44:15 -07:00
Filipi da Silva Fuchter
727eb12b16 Merge pull request #2648 from pipecat-ai/filipi/pcc_small_webrtc
Creating SmallWebRTCRequestHandler for managing peer connections.
2025-09-16 16:37:04 -03:00
mehrdad
ba96bd05d3 fix: replace Tuple type with TypeAlias for server params in MCP client 2025-09-16 11:44:25 -07:00
Mark Backman
8ead309f8d 38b: Update bundled ONNX smart-turn model 2025-09-16 13:17:14 -04:00
Mark Backman
fad0e55c64 Add websockets-base optional dependency and use for DRY pyproject.toml 2025-09-16 11:24:38 -04:00
Mark Backman
74b1af56a0 Update uv.lock 2025-09-16 11:21:49 -04:00
Mark Backman
6924850ec4 Add support for websockets 15.0 2025-09-16 11:21:49 -04:00
marcus-daily
dfe7815dc5 Smart Turn v3: removing torch and torchaudio deps 2025-09-16 16:02:41 +01:00
Pablo Elgueta
69f0a75882 feat: add support for global location in Vertex AI base URL
- Update _get_base_url method to handle 'global' location case
- Use 'aiplatform.googleapis.com' for global locations
- Use '{location}-aiplatform.googleapis.com' for regional locations
- Maintains backward compatibility with existing regional endpoints
2025-09-16 10:28:22 -03:00
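The host selection described in the bullets above might look roughly like the following. This is a minimal sketch, assuming only the two host patterns named in the commit message; `vertex_api_host` is a hypothetical helper name, not the service's actual `_get_base_url` method.

```python
def vertex_api_host(location: str) -> str:
    """Return the Vertex AI API host for a location (hypothetical helper)."""
    if location == "global":
        # The global location uses the host without a region prefix.
        return "aiplatform.googleapis.com"
    # Regional locations prefix the host with the region name.
    return f"{location}-aiplatform.googleapis.com"
```

Existing regional values pass through the second branch unchanged, which is how backward compatibility with regional endpoints is preserved in this sketch.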
Mark Backman
cca90791c4 Merge pull request #2652 from pipecat-ai/mb/fix-audio-buffer-processor-has-audio
fix: AudioBufferProcessor has_audio returns based on user or bot audi…
2025-09-15 18:43:59 -07:00
Mark Backman
f2a5d408de fix: AudioBufferProcessor has_audio returns based on user or bot audio existing 2025-09-15 21:35:35 -04:00
Aleix Conchillo Flaqué
044c6eba46 Merge pull request #2655 from pipecat-ai/aleix/add-on-pipeline-finalized
PipelineTask: add on_pipeline_finished event
2025-09-15 15:32:04 -07:00
Aleix Conchillo Flaqué
db71089f5e PipelineTask: add on_pipeline_finished event
This deprecates `on_pipeline_stopped`, `on_pipeline_ended` and
`on_pipeline_cancelled`.
2025-09-15 15:28:33 -07:00
Aleix Conchillo Flaqué
f861f5066f Merge pull request #2654 from pipecat-ai/aleix/unify-on-client-disconnected
transports: on_client_disconnected only if remote client disconnects
2025-09-15 15:18:57 -07:00
kompfner
81cede2c60 Merge pull request #2653 from pipecat-ai/pk/llm-context-adapting-tests
`LLMContext`-adapting unit tests
2025-09-15 16:38:46 -04:00
kompfner
7603203230 Merge pull request #2644 from pipecat-ai/pk/run-inference-unit-tests
`run_inference` unit tests
2025-09-15 16:26:10 -04:00
Aleix Conchillo Flaqué
8569b61598 transports: on_client_disconnected only if remote client disconnects 2025-09-15 11:35:40 -07:00
Paul Kompfner
fe42187dc1 Implement LLMService.create_llm_specific_message() so that users don't need to just know what value of llm to provide to the LLMSpecificMessage constructor 2025-09-15 14:15:22 -04:00
Paul Kompfner
999e88c942 Add unit tests for AWSBedrockLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 12:08:21 -04:00
Paul Kompfner
c04df2f28b Add unit tests for AnthropicLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 11:55:48 -04:00
Paul Kompfner
100ef0ab5c Add unit tests for GeminiLLMAdapter.get_llm_invocation_params(), focusing on messages specifically 2025-09-15 11:38:23 -04:00
Paul Kompfner
42886d7105 Add unit tests for OpenAILLMAdapter.get_llm_invocation_params(), focusing on messages specifically. Also, fix a bug in OpenAILLMAdapter (found thanks to the unit tests) where it wasn't "unwrapping" LLMSpecificMessages. 2025-09-15 11:17:11 -04:00
Mark Backman
22cbba002a Merge pull request #2651 from pipecat-ai/mb/heygen-bot-speaking-frame
fix: push BotStartedSpeakingFrame in HeyGenVideoService
2025-09-15 08:02:25 -07:00
Filipi Fuchter
0a043154f2 Removing not used import. 2025-09-15 10:46:43 -03:00
Filipi Fuchter
5e322eba9e Supporting both single and multiple connection modes. 2025-09-15 10:43:46 -03:00
Filipi Fuchter
11d0c3d46d Refactoring SmallWebRTCRequestHandler. 2025-09-15 09:58:44 -03:00
Mark Backman
c873798ce5 fix: push BotStartedSpeakingFrame in HeyGenVideoService 2025-09-14 08:12:44 -04:00
Filipi Fuchter
95f72f6dce Creating SmallWebRTCRequestHandler for managing peer connections. 2025-09-12 18:15:24 -03:00
Aleix Conchillo Flaqué
d8cd28bb8b Merge pull request #2640 from pipecat-ai/aleix/pipecat-0.0.85
update CHANGELOG for 0.0.85
2025-09-12 11:06:41 -07:00
Aleix Conchillo Flaqué
c2df6c8aee update CHANGELOG for 0.0.85 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
82478be861 scripts(evals): add 19b-openai-realtime-text 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
0f2b7bc01b examples(foundational): fix 19b-openai-realtime-beta-text 2025-09-12 11:03:32 -07:00
Aleix Conchillo Flaqué
1b2a5df017 Merge pull request #2622 from pipecat-ai/mb/call-data-runner
Add to, from phone info and custom data to the development runner
2025-09-12 10:28:17 -07:00
Mark Backman
2f496ac74f Add optional body parameter to WebsocketRunnerArguments 2025-09-12 11:28:12 -04:00
Mark Backman
22633a63b0 Update changelog 2025-09-12 11:15:03 -04:00
Mark Backman
e5ed0424e4 Remove to/from data from Plivo, as it will rely on body information 2025-09-12 11:10:03 -04:00
Paul Kompfner
786387722a Fix an issue in AWSBedrockLLMService.run_inference—exceptions should propagate, just like with other LLM services 2025-09-12 11:09:32 -04:00
Paul Kompfner
9f82c6b4a4 Add unit tests for run_inference 2025-09-12 11:07:11 -04:00
Mark Backman
99cfcb1d4e Parsed custom data from Plivo extraHeaders 2025-09-12 08:11:30 -04:00
Mark Backman
d595676436 Add custom data handling for Twilio 2025-09-12 08:11:30 -04:00
Aleix Conchillo Flaqué
0190812ee8 Merge pull request #2639 from pipecat-ai/aleix/min-words-interruption-unit-test
MinWordsInterruptionStrategy unit test
2025-09-11 18:52:39 -07:00
Aleix Conchillo Flaqué
2a24061bbb examples(07ad): remove deprecated user_continuous_stream 2025-09-11 18:50:00 -07:00
Aleix Conchillo Flaqué
89f7e7d199 update CHANGELOG with BaseOutputTransport fix 2025-09-11 16:58:44 -07:00
Aleix Conchillo Flaqué
384814e640 Merge pull request #2456 from a6kme/patch-1
Only set last_frame_time when handling OutputAudioRawFrame
2025-09-11 16:56:25 -07:00
Aleix Conchillo Flaqué
ab4364b833 update CHANGELOG and fix formatting 2025-09-11 15:34:47 -07:00
Aleix Conchillo Flaqué
fafdadad3c Merge pull request #2473 from TheNotary/adds-interim-transcription-frame-support
adds support to Azure STT for creating InterimTranscriptFrames
2025-09-11 15:33:38 -07:00
Aleix Conchillo Flaqué
05dc2fa916 updated CHANGELOG.md with GoogleTTSService updates 2025-09-11 14:36:21 -07:00
Aleix Conchillo Flaqué
0c30cc6ea6 Merge pull request #2547 from manishkjs/feat/google-tts-voice-cloning
feat: add voice cloning and speaking rate to GoogleTTSService
2025-09-11 14:32:21 -07:00
Aleix Conchillo Flaqué
c26d336e34 Merge pull request #2545 from pipecat-ai/aleix/aws-nova-sonic-pre-load-cue
AWSNovaSonicLLMService: pre-load audio cue in the constructor
2025-09-11 14:31:26 -07:00
Mark Backman
37b6198787 Merge pull request #2635 from pipecat-ai/mb/openai-tts-speed 2025-09-11 14:22:51 -07:00
kompfner
3c271da94c Merge pull request #2633 from pipecat-ai/pk/uv-readme-updates
Updating the README to reflect that:
2025-09-11 17:07:41 -04:00
kompfner
be28d3f93b Merge pull request #2637 from pipecat-ai/pk/llm-context-evals-and-bug-fix
`LLMContext` evals and bug fix
2025-09-11 17:00:07 -04:00
marcus-daily
d2f210e960 Bundle Smart Turn v3 with Pipecat 2025-09-11 21:37:16 +01:00
Aleix Conchillo Flaqué
57add41971 tests: add unit test for MinWordsInterruptionStrategy 2025-09-11 13:07:30 -07:00
Aleix Conchillo Flaqué
74b38b59d6 tests(utils): allow passing PipelineParams to run_test() 2025-09-11 13:02:21 -07:00
kompfner
dac58deffc Merge pull request #2636 from pipecat-ai/pk/uv-lock-update-for-smart-turn-v3
uv.lock update for Smart Turn v3
2025-09-11 14:35:36 -04:00
Paul Kompfner
aff11f5121 Fix missing import in llm_response_universal.py 2025-09-11 14:33:17 -04:00
Paul Kompfner
a4023d3915 Update evals to include examples that exercise the universal LLMContext 2025-09-11 14:32:56 -04:00
Paul Kompfner
d6543d244d uv.lock update for Smart Turn v3 2025-09-11 14:07:17 -04:00
Mark Backman
fafcd79870 OpenAITTSService: add speed arg 2025-09-11 13:53:52 -04:00
Paul Kompfner
6a717fbbd1 Updating the README to reflect that:
- various dependencies that previously didn't work with Python 3.13 now seem to
- ultravox isn't fully supported on macOS
2025-09-11 12:27:43 -04:00
Aleix Conchillo Flaqué
9b3f6927c2 Merge pull request #2621 from pipecat-ai/aleix/interruption-task-frame
interruption task frame
2025-09-11 09:22:35 -07:00
Aleix Conchillo Flaqué
0b21f8a6bd FrameProcessor: add push_interruption_task_frame_and_wait() 2025-09-11 09:19:44 -07:00
Aleix Conchillo Flaqué
8249b014f0 frames: BotInterruptionFrame is deprecated, use InterruptionTaskFrame 2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué
9d9f10ae0e frames: StartInterruptionFrame is deprecated, use InterruptionFrame 2025-09-11 09:01:54 -07:00
Aleix Conchillo Flaqué
e27b23694d frames: add new TaskFrame
TaskFrame is a base class for other frames that are meant to be sent to the
pipeline task.
2025-09-11 09:01:52 -07:00
marcus-daily
66ce5fe6bd Ruff fixes 2025-09-11 16:04:56 +01:00
marcus-daily
a9b53dc800 Update inference session options 2025-09-11 16:04:56 +01:00
marcus-daily
818352a300 Formatting 2025-09-11 16:04:56 +01:00
marcus-daily
3e9fc7be19 Update onnxruntime version 2025-09-11 16:04:56 +01:00
marcus-daily
a2e76bcad8 Smart Turn V3 support 2025-09-11 16:04:56 +01:00
Mark Backman
8e8e42717b Add to and from phone information to the development runner 2025-09-11 10:44:21 -04:00
kompfner
b31322e38e Merge pull request #2619 from pipecat-ai/pk/aws-universal-context
Expand universal `LLMContext` support to AWS Bedrock
2025-09-11 09:33:08 -04:00
Jessie Wei
305108be9a [TwilioFrameSerializer]: Add parameter validation 2025-09-10 23:00:15 +00:00
Aleix Conchillo Flaqué
908325484d Merge pull request #2614 from pipecat-ai/aleix/readme-client-sdks-table
README: update clients' table
2025-09-10 10:21:18 -07:00
Mark Backman
dd6ff789c7 Merge pull request #2628 from pipecat-ai/mb/fix-13-push-frame
fix: 13 foundational examples now push frames from TranscriptionLogger
2025-09-10 09:13:04 -07:00
Mark Backman
f4938e0fad fix: 13 foundational examples now push frames from TranscriptionLogger 2025-09-10 10:40:10 -04:00
Jessie Wei
2e1f397d17 Support TwilioFrameSerializer region/edge settings so that the call is terminated successfully for regions other than the default us1 region 2025-09-10 14:30:10 +00:00
James Hush
e8f60c7c6f Handle missing rawResponse in transcription messages (#2623)
* Handle missing rawResponse in transcription messages

- Use message.get('rawResponse', {}) to safely access rawResponse field
- Default is_final to False when rawResponse is missing
- Add proper type annotations for better code clarity
- Minor import formatting cleanup

This prevents KeyError crashes when transcription messages from Daily's API
don't include the rawResponse field in edge cases.

* docs: add changelog line
2025-09-10 15:03:23 +08:00
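The safe-access pattern from the commit above can be sketched as follows. The `message.get('rawResponse', {})` call and the `False` default come from the commit message; the `is_final` key name inside `rawResponse` and the helper name `is_final_transcription` are assumptions made for the illustration.

```python
from typing import Any, Dict


def is_final_transcription(message: Dict[str, Any]) -> bool:
    """Return whether a transcription message is final (hypothetical helper)."""
    # Fall back to an empty dict so a missing rawResponse does not raise KeyError.
    raw_response = message.get("rawResponse", {})
    # Assumed key name; default to False when the field is absent.
    return bool(raw_response.get("is_final", False))
```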
Paul Kompfner
fedb8a201f Update 12d example to use LLMContext, now that AWS Bedrock supports it 2025-09-09 16:24:13 -04:00
Paul Kompfner
8ccd220a60 Add universal LLMContext support to AWSBedrockLLMService.run_inference() 2025-09-09 16:00:32 -04:00
Paul Kompfner
fe79de8f27 When converting universal LLMContext messages to AWS Bedrock expected format, automatically update non-initial "system"-role messages to "user"-role messages, as we do in other non-OpenAI LLM services 2025-09-09 15:50:03 -04:00
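The role conversion described in the commit above could be sketched like this. It is an illustration over plain dictionaries, assuming that "initial" means the first message in the list; `demote_non_initial_system_messages` is a hypothetical name, not the adapter's actual code.

```python
from typing import Any, Dict, List


def demote_non_initial_system_messages(
    messages: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return a copy where every "system" message after the first is "user"."""
    result = []
    for index, message in enumerate(messages):
        if index > 0 and message.get("role") == "system":
            # Copy instead of mutating the caller's message.
            message = {**message, "role": "user"}
        result.append(message)
    return result
```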
Paul Kompfner
176573c342 Add to CHANGELOG AWS Bedrock's support for universal LLMContext 2025-09-09 15:31:56 -04:00
Paul Kompfner
75f9914f49 Add support for universal LLMContext to AWS Bedrock LLM service 2025-09-09 15:25:04 -04:00
Paul Kompfner
f4d6715e32 Add foundational example using AWS Bedrock with universal LLMContext 2025-09-09 10:49:51 -04:00
kompfner
38f6e33f97 Merge pull request #2598 from pipecat-ai/pk/deprecate-vision-image-raw-frame
Remove `VisionImageRawFrame`, which was previously being handled dire…
2025-09-08 17:13:28 -04:00
Paul Kompfner
1c3e4e34e5 Minor fix to AWS Bedrock console logging to handle image messages in the context 2025-09-08 17:10:11 -04:00
Paul Kompfner
623c660027 Remove debugging comment 2025-09-08 17:01:51 -04:00
Paul Kompfner
a3e65ab3b5 The VisionImageRawFrame removal and corresponding VisionImageFrameAggregator deprecation will now happen in version 0.0.85 2025-09-08 17:01:47 -04:00
Paul Kompfner
f3a4b416df Remove VisionImageRawFrame, which was previously being handled directly by the LLM services, and deprecate the associated VisionImageFrameAggregator.
Removing `VisionImageRawFrame` lets us simplify LLM services' logic, getting us closer to the idealized architecture where all they care about is handling context frames.

This change is in service of getting us closer to ready to deprecate usage of `OpenAILLMContext` and subclasses in favor of the universal `LLMContext`, at least for the traditional text-to-text LLMs.

Why remove `VisionImageRawFrame` rather than deprecate? It's "internal"—only created by `VisionImageFrameAggregator`—and never intended to be used directly by users (it would be difficult to use directly anyway).

Move the logic that was once in `VisionImageFrameAggregator` directly into the examples. Reasoning:
- If `UserImageRequester` is defined in the examples, it makes sense for `UserImageProcessor` to be too, as it’s the flip side of the same coin, so to speak
- The logic is now pretty trivial
- This kind of one-shot, history-less image-describing pipeline shouldn't be common at all; it's ok for it to live in examples rather than as a dedicated class
- In the short term, this enables us to create `LLMContext`s for services that support it and `OpenAILLMContext`s for services that don't yet (AWS)

This commit also adds missing translation from OpenAI-format image context messages to AWS format. Note that this isn't a wasted effort in the face of the upcoming migration to universal `LLMContext`—this work will be reused as it has to be implemented there too.
2025-09-08 17:00:08 -04:00
Aleix Conchillo Flaqué
aa471a4ef5 update CHANGELOG with LiveKitTransport updates 2025-09-08 13:53:21 -07:00
Aleix Conchillo Flaqué
d55133a44f Merge pull request #2604 from alexyzhou/feature/livekit_video_and_bug_fix
Feature: Add support for livekit video stream and minor bug fixes
2025-09-08 13:51:14 -07:00
Aleix Conchillo Flaqué
0f1cf81691 README: update clients' table 2025-09-08 12:08:32 -07:00
kompfner
ac4d335799 Merge pull request #2613 from pipecat-ai/pk/mistral-message-fixups
Apply additional fixups to context messages to meet Mistral-specific …
2025-09-08 13:59:54 -04:00
Paul Kompfner
e65385c151 Tweak the Mistral-specific context messages fixup logic to handle the (mostly academic) possibility of a "tool" message appearing at the end 2025-09-08 13:55:09 -04:00
Paul Kompfner
0bb7df7a6b Remove stray debugging message 2025-09-08 13:38:26 -04:00
Paul Kompfner
daee1ddf3b Apply additional fixups to context messages to meet Mistral-specific requirements 2025-09-08 11:26:58 -04:00
Adithya Suresh
d1f72c1c0b Added usage metrics 2025-09-08 14:44:25 +10:00
Aleix Conchillo Flaqué
1cccb97ccf Merge pull request #2608 from pipecat-ai/aleix/deprecate-noisereducefilter
audio(filters): deprecate NoisereduceFilter
2025-09-07 20:54:09 -07:00
Aleix Conchillo Flaqué
d7794abf21 audio(filters): deprecate NoisereduceFilter 2025-09-07 20:52:17 -07:00
Aleix Conchillo Flaqué
6a6a63a532 Merge pull request #2607 from pipecat-ai/aleix/scripts-evals-improve-eval-prompt
scripts(evals): allow user to talk and only eval when needed
2025-09-07 20:49:43 -07:00
Mark Backman
6edb6fed41 Merge pull request #2606 from pipecat-ai/mb/quickstart-lockfile
Remove uv.lock from quickstart
2025-09-07 06:10:14 -07:00
Mark Backman
a537382816 Add OpenAIRealtimeLLMService, AzureRealtimeLLMService (#2596)
* Add OpenAI Realtime module

* Add foundational examples for OpenAI Realtime

* Add deprecation warning to OpenAIRealtimeBetaLLMService

* Add deprecation warning to AzureRealtimeBetaLLMService

* Update Changelog
2025-09-07 09:09:57 -04:00
Aleix Conchillo Flaqué
46deaada70 scripts(evals): allow user to talk and only eval when needed 2025-09-06 19:19:08 -07:00
TheNotary
7366b1aee0 adds missing InterimTranscriptionFrame import 2025-09-06 14:40:19 -05:00
Mark Backman
dbc52bc6b0 Remove uv.lock from quickstart 2025-09-06 11:13:50 -04:00
Alex Zhou
d6432589f6 fix: fix format and lint by ruff 2025-09-06 10:50:47 +08:00
Alex Zhou
13b73d4406 feat: Add support for pipecat video stream; fix the bug of duplicate participants when connecting; fix the bug of RTVI messages sent via livekit messages; 2025-09-06 10:41:33 +08:00
Aleix Conchillo Flaqué
85d8282f7e Merge pull request #2602 from pipecat-ai/aleix/pipecat-0.0.84
update CHANGELOG for 0.0.84
2025-09-05 19:35:26 -07:00
Aleix Conchillo Flaqué
070690ec64 update CHANGELOG for 0.0.84 2025-09-05 18:22:50 -07:00
Aleix Conchillo Flaqué
b9c96fd623 Merge pull request #2601 from pipecat-ai/aleix/daily-python-0.19.9
pyproject: update daily-python to 0.19.9
2025-09-05 18:21:49 -07:00
Aleix Conchillo Flaqué
f8b2ab6331 pyproject: update daily-python to 0.19.9 2025-09-05 18:14:57 -07:00
Mark Backman
ea3f7e3c34 Merge pull request #2600 from pipecat-ai/mb/livekit-dtmf
LiveKitTransport: Add support to send DTMF
2025-09-05 15:25:32 -07:00
Mark Backman
2f44f88b08 LiveKitTransport: Add support to send DTMF 2025-09-05 18:23:04 -04:00
Mark Backman
25747a001b Merge pull request #2599 from pipecat-ai/mb/fix-daily-dtmf
DTMF: Add support for native DTMF implementation where available
2025-09-05 15:20:05 -07:00
Mark Backman
fbe4338440 DTMF: Add support for native DTMF implementation where available 2025-09-05 18:16:56 -04:00
Filipi da Silva Fuchter
64b4c65728 Merge pull request #2595 from pipecat-ai/filipi/heygen_quality
Improving HeyGen example video quality.
2025-09-05 17:19:25 -03:00
kompfner
29442969a9 Merge pull request #2597 from pipecat-ai/pk/fix-anthropic-tool-less-usage
Fix Anthropic tool-less usage
2025-09-05 15:30:29 -04:00
Paul Kompfner
dc2e1d4ad3 Fix Anthropic tool-less usage 2025-09-05 11:47:31 -04:00
Filipi Fuchter
5477dfcbea Improving HeyGen example video quality. 2025-09-05 11:30:01 -03:00
kompfner
516f0e08ab Merge pull request #2590 from pipecat-ai/pk/gemini-multimodal-live-doesnt-support-llm-context
Raise an error when attempting to use Gemini Multimodal Live with uni…
2025-09-05 09:22:33 -04:00
Paul Kompfner
246f9f3325 Raise an error when attempting to use Gemini Multimodal Live with universal LLMContext. This is exactly the same error we already have for the other s2s models, AWS Nova Sonic and OpenAI Realtime, it was just missing from this service. 2025-09-04 16:47:08 -04:00
Manish Kumar
4699ee8d86 docs: add docstring for voice_cloning_key and update CHANGELOG 2025-09-04 22:45:51 +05:30
kompfner
3d850e8cc5 Merge pull request #2574 from pipecat-ai/pk/expand-universal-llm-context-support-to-anthropic
Expand universal `LLMContext` support to Anthropic
2025-09-04 13:09:44 -04:00
Paul Kompfner
6e734a37f9 Fix a bug in AWSBedrockLLMService.run_inference(); it was expecting the wrong format for the system instruction 2025-09-04 13:04:15 -04:00
Paul Kompfner
f72ca2fd7d Remove unnecessary system_instruction argument from run_inference() methods 2025-09-04 13:04:15 -04:00
Paul Kompfner
0826d72f74 Add deprecation warning for using enable_prompt_caching_beta param 2025-09-04 13:04:15 -04:00
Paul Kompfner
ba5ebfa0ec Fixed subtle CHANGELOG conflict after release of 0.0.83: universal LLMContext support for Anthropic didn't make that release. Also, some automatic Prettier fixes. 2025-09-04 13:04:11 -04:00
Paul Kompfner
dc3412b2df Bump a deprecation to 0.0.84, as 0.0.83 just shipped 2025-09-04 13:03:06 -04:00
Paul Kompfner
b2e9fd9341 Rename Anthropic enable_prompt_caching_beta parameter to just enable_prompt_caching 2025-09-04 13:03:06 -04:00
Paul Kompfner
c11b207c97 Add Anthropic to CHANGELOG list of services newly supporting runtime LLM switching 2025-09-04 13:03:06 -04:00
Paul Kompfner
d6205027cf Trivial cleanup 2025-09-04 13:03:06 -04:00
Paul Kompfner
986160c077 Fix a bug where the Anthropic adapter's merge-consecutive-messages-with-the-same-role logic was unintentionally affecting the source LLMContext's messages, resulting in more and more duplication of text with each inference 2025-09-04 13:03:06 -04:00
Paul Kompfner
b56ff86fee Minor refactor of AnthropicLLMAdapter cache-control-marker-adding logic (without really changing its behavior) 2025-09-04 13:03:06 -04:00
Paul Kompfner
5c574eaad9 Add support for universal LLMContext to Anthropic LLM service 2025-09-04 13:03:06 -04:00
Paul Kompfner
2df231143a Add foundational example using Anthropic with universal LLMContext 2025-09-04 13:03:06 -04:00
Aleix Conchillo Flaqué
e3597801d4 AWSNovaSonicLLMService: pre-load audio cue in the constructor 2025-09-04 09:31:39 -07:00
Aleix Conchillo Flaqué
65298ab792 update CHANGELOG with AWSBedrockLLMService fix 2025-09-04 09:24:55 -07:00
Aleix Conchillo Flaqué
b609b02614 Merge pull request #2568 from ezisezis/fix-bedrock-timeouts
fix timeout handling in AWSBedrockLLMService
2025-09-04 09:23:28 -07:00
Aleix Conchillo Flaqué
f2b50c14d2 Merge pull request #2573 from pipecat-ai/vp-minor-fixes-07s
example 07s: minor typo updates
2025-09-04 09:21:32 -07:00
Aleix Conchillo Flaqué
ee3b023986 update CHANGELOG with OpenAIImageGenService fix 2025-09-04 09:20:02 -07:00
Aleix Conchillo Flaqué
0d9e1190d7 Merge pull request #2583 from sassanh/main
fix: openai image generator now initiates URLImageRawFrame with correct order of arguments
2025-09-04 09:17:51 -07:00
Mark Backman
595a7c7fbe Merge pull request #2587 from pipecat-ai/mb/update-quickstart-0.0.83
Update quickstart pyproject to use 0.0.83
2025-09-04 07:42:56 -07:00
Mark Backman
586586f743 Update quickstart pyproject to use 0.0.83 2025-09-04 10:36:58 -04:00
Mark Backman
a1c6ad539d Merge pull request #2585 from ashotbagh/feat/asyncai-multilingual-support
feat(asyncai): add multilingual TTS support
2025-09-04 05:03:45 -07:00
Ashot
daf7fed8b3 feat(asyncai): add multilingual TTS support 2025-09-04 13:58:50 +04:00
Adithya Suresh
446d99d194 Bug fixes 2025-09-04 17:08:16 +10:00
Adithya Suresh
cbdbdee4c0 Initial StrandsAgentsProcessor implementation 2025-09-04 16:58:58 +10:00
Sassan Haradji
a26647c433 fix: openai image generator now initiates URLImageRawFrame with correct order of arguments 2025-09-04 06:09:57 +03:30
Aleix Conchillo Flaqué
0fab56fc13 Merge pull request #2577 from pipecat-ai/aleix/pipecat-0.0.83
update CHANGELOG for 0.0.83
2025-09-03 16:49:24 -07:00
Aleix Conchillo Flaqué
f0baff94b2 update CHANGELOG for 0.0.83 2025-09-03 16:47:43 -07:00
Aleix Conchillo Flaqué
d146170fd6 Merge pull request #2580 from pipecat-ai/mb/add-cerebras-evals
Add 14k (CerebrasLLMService) to release evals
2025-09-03 15:08:28 -07:00
Filipi da Silva Fuchter
001a2d36e5 Merge pull request #2579 from pipecat-ai/filipi/input_message
Creating InputTransportMessageUrgentFrame
2025-09-03 19:01:07 -03:00
Filipi Fuchter
99e237b1e2 Fixed an issue where messages received from the transport were always being resent. 2025-09-03 18:58:34 -03:00
Aleix Conchillo Flaqué
978f644f19 Merge pull request #2578 from pipecat-ai/aleix/user-speaking-frame
add UserSpeakingFrame and UserStartedSpeakingFrame/UserStoppedSpeakingFrame updates
2025-09-03 14:55:45 -07:00
Aleix Conchillo Flaqué
5a4c6b9618 BaseInputTransport: push UserStartedSpeakingFrame/UserStoppedSpeakingFrame upstream 2025-09-03 14:32:32 -07:00
Mark Backman
977a57c8fb Add 14k (CerebrasLLMService) to release evals 2025-09-03 17:11:38 -04:00
Mark Backman
c64bc5a636 Merge pull request #2576 from joyceerhl/joyce/cerebras-default
fix: update default Cerebras model to GPT-OSS-120B
2025-09-03 14:10:28 -07:00
Joyce Er
eba006d39c Fix nits 2025-09-03 14:07:49 -07:00
Joyce Er
a001f6f193 Switch to GPT-OSS-120B 2025-09-03 14:00:27 -07:00
Aleix Conchillo Flaqué
09d6ec1098 introduce and push UserSpeakingFrame upstream/downstream 2025-09-03 13:56:01 -07:00
Aleix Conchillo Flaqué
f56be9315a Merge pull request #2572 from pipecat-ai/aleix/deepgram-disconnect-task
ParallelPipeline: wait for CancelFrame in all branches
2025-09-03 13:10:55 -07:00
Mark Backman
8e5880b2e7 Merge pull request #2575 from pipecat-ai/mb/fix-foundational-frame-direction
fix: Specify frame direction in 06a push_frame
2025-09-03 12:46:10 -07:00
Joyce Er
d8ac6f2c1a fix: update default Cerebras model to Qwen 3 32B 2025-09-03 12:23:36 -07:00
Mark Backman
052ffe8712 fix: Specify frame direction in 06a push_frame 2025-09-03 15:07:05 -04:00
Aleix Conchillo Flaqué
b52296450c DeepgramSTTService: remove raising CancelledError 2025-09-03 11:24:02 -07:00
Aleix Conchillo Flaqué
c71cec04d3 ParallelPipeline: wait for CancelFrame in all branches 2025-09-03 11:23:37 -07:00
vipyne
83f64ecd3b example 07s: minor typo updates 2025-09-03 12:11:07 -05:00
Aleix Conchillo Flaqué
d19170d8b1 Merge pull request #2565 from pipecat-ai/aleix/reorganize-transports
transports: reorganize module
2025-09-03 08:52:49 -07:00
Mark Backman
8b95d74193 Merge pull request #2571 from pipecat-ai/mb/fix-docs-0.0.83
Fix docs generation before 0.0.83 release
2025-09-03 08:52:20 -07:00
Mark Backman
3c4694a8f1 Fix docs generation before 0.0.83 release 2025-09-03 11:31:14 -04:00
kompfner
b9748b1228 Merge pull request #2563 from pipecat-ai/pk/expand-universal-llm-context-support-to-more-llms
Expand universal `LLMContext` support to more LLMs
2025-09-03 11:20:26 -04:00
Paul Kompfner
def1cf1548 Update CHANGELOG with entry about expanded support among LLM services for new universal LLMContext 2025-09-03 09:07:57 -04:00
Paul Kompfner
9b216116f1 Remove supports_universal_context gate from OpenAI-library-based LLM services. It's no longer needed, as they all now support universal LLMContext. 2025-09-03 09:07:18 -04:00
Paul Kompfner
7cb372ebb9 Add support for universal LLMContext to Together.ai LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
6838bc1e51 Add a few missing LLMContext type hints 2025-09-03 09:07:18 -04:00
Paul Kompfner
e04f42167e Add support for universal LLMContext to SambaNova LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
91a3f63e28 Add support for universal LLMContext to Qwen LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b24eb76559 Add QWEN_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d9ea02595b Add support for universal LLMContext to Ollama LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
5bc0e49baa Add support for universal LLMContext to Perplexity LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ec138b97d9 Add support for universal LLMContext to OpenRouter LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
0c32cc29a7 Add support for universal LLMContext to OpenPipe LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d740bab99e Add support for universal LLMContext to NVIDIA NIM LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
ac62183eb6 Add NVIDIA_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Paul Kompfner
34f823bcac Add support for universal LLMContext to Groq LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b4e1051066 Add support for universal LLMContext to Grok LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
d8882bc381 Add support for universal LLMContext to Google Vertex AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
da18d0a562 Add support for universal LLMContext to Fireworks AI LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
f8e13a82cf Fix Fireworks AI function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b00d37e94 Add support for universal LLMContext to DeepSeek LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2dbd17da4d Fix Cerebras function calling example 2025-09-03 09:07:18 -04:00
Paul Kompfner
d45fbd5455 Add support for universal LLMContext to Cerebras LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
b22bdff6d0 Add support for universal LLMContext to Azure LLM service 2025-09-03 09:07:18 -04:00
Paul Kompfner
2b286365e0 Add MISTRAL_API_KEY to env.example 2025-09-03 09:07:18 -04:00
Eduards Klavins
0a3e98857e fix timeout handling in AWSBedrockLLMService 2025-09-03 11:52:30 +03:00
Aleix Conchillo Flaqué
aeb9f1ffca transports: reorganize module 2025-09-02 17:31:39 -07:00
Filipi da Silva Fuchter
7f1100bd4c Merge pull request #2513 from pipecat-ai/filipi/whatsapp
Support for the new WhatsApp Cloud API
2025-09-02 18:06:59 -03:00
Filipi Fuchter
8fbd9b5af7 Added support for WhatsApp User-initiated Calls. 2025-09-02 18:05:00 -03:00
Filipi Fuchter
49c1f0bd08 Fixed SmallWebRTCTransport to not use mid to decide if the transceiver should be sendrecv or not. 2025-09-02 18:04:51 -03:00
Aleix Conchillo Flaqué
ce7a0512f9 Merge pull request #2562 from pipecat-ai/aleix/ai-coustics-speech-enhancement
add ai-coustics speech enhancement filter
2025-09-02 13:13:28 -07:00
Aleix Conchillo Flaqué
fdcd14dd21 updated CHANGELOG with AICFilter and fix deprecations 2025-09-02 13:10:10 -07:00
Mark Backman
0386599163 Added Daily SIP room creation utility (#2560)
* Added Daily SIP room creation utility to configure() function

* Add sip_codecs to the DailyRoomSipParams
2025-09-02 14:12:04 -04:00
Corvin Jaedicke
c1ce3d7d2b bumped aic-sdk version to v1.0.1 with minor changes 2025-09-02 11:11:29 -07:00
Corvin Jaedicke
8ecece2d9c Add AIC SDK audio filter 2025-09-02 11:11:29 -07:00
Filipi da Silva Fuchter
0d8ab7abca Merge pull request #2552 from pipecat-ai/filipi/freeze_issues
Fixed an issue where the pipeline could freeze.
2025-09-02 14:43:08 -03:00
Filipi Fuchter
dea7c22020 Fixed an issue where the pipeline could freeze. 2025-09-02 13:58:41 -03:00
Mark Backman
cfe11267f4 Merge pull request #2546 from pipecat-ai/mb/update-changelog-mem0
Add mem0 changelog entry
2025-09-02 07:22:52 -07:00
Mark Backman
d0c97d3602 Add mem0 changelog entry 2025-09-02 10:03:17 -04:00
Mark Backman
37e1551abc Merge pull request #2555 from pipecat-ai/mb/update-uv-lock-quickstart 2025-09-02 04:19:15 -07:00
Mark Backman
e1477e79f0 Merge pull request #2538 from rimelabs/rime-flush-audio-update
Use Rime’s official {"operation": "flush"} command in flush_audio() for proper text buffer flushing
2025-09-01 18:01:20 -07:00
Mark Backman
547b126d98 Update required pipecat-ai version for quickstart 2025-09-01 20:52:44 -04:00
Mark Backman
447e3b28eb Update uv.lock for quickstart 2025-09-01 20:52:12 -04:00
gokuljs
472efa2971 ruff format fix 2025-09-02 04:25:28 +05:30
Mark Backman
64486ef50b Merge pull request #2536 from gladiaio/PLA-38-missing-config-parameters
Gladia - add missing config parameters
2025-09-01 12:42:10 -07:00
gokuljs
5f801743d0 Add changelog entry for RimeTTSService flush_audio API update 2025-09-01 22:16:02 +05:30
Fabrice Lamant
802c5d04f4 update changelog 2025-09-01 10:21:11 +02:00
Aleix Conchillo Flaqué
83b90da53a Merge pull request #2537 from pipecat-ai/aleix/pipeline-task-cleanup-observers
PipelineTask: cleanup observers
2025-08-31 13:44:38 -07:00
Aleix Conchillo Flaqué
1f49de5cdf Merge pull request #2542 from pipecat-ai/aleix/remove-stop-interruption-frame
frames: remove StopInterruptionFrame
2025-08-31 13:44:22 -07:00
Manish Kumar
2ee481d541 feat: add voice cloning and speaking rate to GoogleTTSService 2025-08-30 23:04:59 +05:30
Mark Backman
7cf099eae7 Merge pull request #2541 from parshvadaftari/user/parshva/update_mem0
Update mem0 integration
2025-08-30 05:11:31 -07:00
Mark Backman
93a8ea3cb2 Merge pull request #2543 from pipecat-ai/mb/docs-extensions
Add Extensions to ref docs generation
2025-08-30 04:20:03 -07:00
Aleix Conchillo Flaqué
776aafddfb Merge pull request #2534 from pipecat-ai/aleix/pyright-1.1.404
pyproject: update pyright and ruff
2025-08-29 19:55:54 -07:00
Mark Backman
d56762262a Fix docs build errors 2025-08-29 20:24:35 -04:00
Mark Backman
bbcf35d657 Add Extensions to reference docs generation 2025-08-29 20:17:34 -04:00
Mark Backman
972546b24f Add IVR navigation (#2529) 2025-08-29 20:08:17 -04:00
Aleix Conchillo Flaqué
8b351f5bec pyproject: update pyright and ruff 2025-08-29 17:02:13 -07:00
Aleix Conchillo Flaqué
bd7d9346b7 frames: remove StopInterruptionFrame 2025-08-29 16:40:01 -07:00
Aleix Conchillo Flaqué
81325be4f3 Merge pull request #2540 from pipecat-ai/aleix/dtmf-tones-slower
audio(dtmf): use longer tones and longer gaps
2025-08-29 15:15:01 -07:00
Aleix Conchillo Flaqué
399f8de6ef audio(dtmf): use longer tones and longer gaps 2025-08-29 15:10:20 -07:00
parshvadaftari
60c070e077 update mem0 integration for reduced latency and better performance 2025-08-30 02:27:36 +05:30
gokuljs
e3f2faabf7 Merge branch 'main' of github.com:rimelabs/pipecat into rime-flush-audio-update 2025-08-30 01:18:50 +05:30
Aleix Conchillo Flaqué
b5a644dd6f PipelineTask: cleanup observers 2025-08-29 10:54:36 -07:00
gokuljs
e06bd6049e update flush operation in flush audio function under rime tts 2025-08-29 21:27:14 +05:30
Fabrice Lamant
25b595e125 add suggestions 2025-08-29 14:51:20 +02:00
Fabrice Lamant
edc8cc1e69 remove sample_rate from GladiaInputParams 2025-08-29 14:00:00 +02:00
Fabrice Lamant
633dd69dee feat: add logging for pipecat version and session url 2025-08-29 13:47:16 +02:00
Fabrice Lamant
1a1d5a1081 feat: add missing config params 2025-08-29 13:46:44 +02:00
Aleix Conchillo Flaqué
c1b8d2acab Merge pull request #2532 from pipecat-ai/aleix/universal-dtmf-support
Universal DTMF support
2025-08-28 21:04:13 -07:00
Aleix Conchillo Flaqué
ea368e4c5f scripts(dtmf): added generate_dtmf.sh to generate DTMF wav files 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
f03deb6ecc DailyTransport: remove send_dtmf() and write_dtmf() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
0e01ac8ef6 BaseOutputTransport: implement generic write_dtmf() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
5787743ab3 audio(dtmf): added DTMF audio files and load_dtmf_audio() 2025-08-28 21:01:41 -07:00
Aleix Conchillo Flaqué
79be0695dd make sure warnings are always displayed 2025-08-28 17:43:29 -07:00
Aleix Conchillo Flaqué
a5c5e069ba move pipecat.frames.frames.KeypadEntry to pipecat.audio.dtmf.types.KeypadEntry 2025-08-28 17:43:29 -07:00
Aleix Conchillo Flaqué
77c34076f7 Merge pull request #2531 from pipecat-ai/aleix/pipecat-0.0.82
update CHANGELOG for 0.0.82
2025-08-28 13:04:41 -07:00
Aleix Conchillo Flaqué
d67cece356 update CHANGELOG for 0.0.82 2025-08-28 13:02:47 -07:00
Aleix Conchillo Flaqué
275c8b59c5 MistralLLMService: fix build_chat_completion_params() 2025-08-28 12:04:14 -07:00
Aleix Conchillo Flaqué
5ebcea2a3b scripts(eval): change "result" function call parameter 2025-08-28 11:38:59 -07:00
Aleix Conchillo Flaqué
64f2135ddc examples(14f): use default models 2025-08-28 11:38:59 -07:00
kompfner
a74231f036 Merge pull request #2515 from pipecat-ai/pk/llm-run-frame
Add `LLMRunFrame` to trigger an LLM response, replacing `context_aggr…
2025-08-28 10:01:00 -04:00
Paul Kompfner
189749b579 Add LLMRunFrame to trigger an LLM response, replacing context_aggregator.user().get_context_frame() 2025-08-28 09:53:33 -04:00
Aleix Conchillo Flaqué
e384ca949e Merge pull request #2512 from pipecat-ai/aleix/textframe-skip-tts
TextFrame: add skip_tts field
2025-08-27 16:26:03 -07:00
Aleix Conchillo Flaqué
eb248fedc1 add skip_tts to LLMFullResponseStartFrame/LLMFullResponseEndFrame 2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
16f57be72c LLMConfigureOutputFrame: allow configuring LLM output 2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
5803936838 TextFrame: add skip_tts field
This lets a text frame bypass TTS while still being included in the LLM
context. Useful for cases like structured text that isn’t meant to be spoken but
should still contribute to context.
2025-08-27 16:23:27 -07:00
Aleix Conchillo Flaqué
d9837dd1e5 Merge pull request #2527 from pipecat-ai/aleix/daily-python-0.19.8
pyproject: update daily-python to 0.19.8
2025-08-27 16:22:49 -07:00
Aleix Conchillo Flaqué
e48c9fc3e2 pyproject: update daily-python to 0.19.8 2025-08-27 16:00:36 -07:00
Aleix Conchillo Flaqué
3c4454a33e Merge pull request #2526 from pipecat-ai/aleix/pipeline-task-wait-for-startframe
PipelineTask: wait for StartFrame to reach end of pipeline
2025-08-27 15:57:10 -07:00
Aleix Conchillo Flaqué
2a0780e6ef PipelineTask: wait for StartFrame to reach end of pipeline
Fixes #2498
2025-08-27 14:23:09 -07:00
Aleix Conchillo Flaqué
5e121346fb Merge pull request #2516 from pipecat-ai/aleix/rtvi-client-version-check
RTVIProcessor: make sure client version is set
2025-08-27 14:02:14 -07:00
Aleix Conchillo Flaqué
2bdca8d22c RTVIProcessor: make sure client version is set 2025-08-27 13:36:11 -07:00
Aleix Conchillo Flaqué
1f5888bcf7 Merge pull request #2517 from pipecat-ai/aleix/unify-get-messages-for-logging
unify get_messages_for_logging()
2025-08-27 12:49:36 -07:00
Mark Backman
3d09f9a2af Merge pull request #2524 from pipecat-ai/mb/cartesia-speed
Cartesia: update speed InputParam
2025-08-27 12:47:29 -07:00
Aleix Conchillo Flaqué
cd3563bb16 unify get_messages_for_logging()
Some implementations were returning a list and some were returning a JSON
string. They should all return a list, and the user can decide whether to
transform that into JSON.
2025-08-27 12:45:24 -07:00
Aleix Conchillo Flaqué
3e79ef4118 Merge pull request #2525 from pipecat-ai/aleix/daily-fix-send-dtmf
DailyTransport: fix sending DTMF tones
2025-08-27 12:44:27 -07:00
Aleix Conchillo Flaqué
2613da1a1f PipelineTask: increase CANCEL_TIMEOUT_SECS to 20 2025-08-27 11:50:48 -07:00
Aleix Conchillo Flaqué
41d40f9a11 DailyTransport: make sure we have a client before joining/leaving 2025-08-27 11:50:48 -07:00
Aleix Conchillo Flaqué
74af2b6aa4 DailyTransport: fix sending DTMF tones 2025-08-27 11:50:48 -07:00
Mark Backman
f7d9f32b0f Cartesia: update speed InputParam 2025-08-27 13:34:28 -04:00
Mark Backman
6074af60ef Merge pull request #2521 from pipecat-ai/mb/update-quickstart-pcc-docker
Update quickstart to use pcc docker command
2025-08-27 08:13:31 -07:00
Mark Backman
7ef6893c0d Merge pull request #2523 from sam-s10s/fix/connection-none
Speechmatics TTS connection issue
2025-08-27 08:09:46 -07:00
Sam Sykes
cc5557e051 changelog 2025-08-27 16:07:31 +01:00
Sam Sykes
06f7a92c99 fix to finally statement 2025-08-27 14:43:07 +01:00
Mark Backman
61a333ccae Update quickstart to use pcc docker command 2025-08-26 21:29:13 -04:00
Mark Backman
fc3d84dff7 Merge pull request #2501 from pipecat-ai/mb/aws-tts-more-flexible-auth
Support additional authentication mechanisms for AWS services
2025-08-26 18:05:37 -07:00
Mark Backman
86a37d8cea Add changelog entry for SentryMetrics missing import fix 2025-08-26 21:00:16 -04:00
Mark Backman
3f66acf9f1 Merge pull request #2520 from geluso/bugfix-missing-asyncio-import
add missing import asyncio
2025-08-26 17:59:25 -07:00
Mark Backman
facfaa2dd4 AWSBedrockLLMService: Allow setting auth credentials via env vars 2025-08-26 20:59:12 -04:00
Mark Backman
8250c381d1 AWSPollyTTSService: allow setting auth credentials through provider chain 2025-08-26 20:58:02 -04:00
Steve Geluso
32f9e48865 add missing import asyncio 2025-08-26 17:40:11 -07:00
Filipi Fuchter
76eef837b6 Removing watchdog from SarvamTTSService. 2025-08-26 18:44:58 -03:00
Filipi Fuchter
c9aaa463b7 Mentioning the recent SarvamTTSService changes in the changelog. 2025-08-26 18:44:58 -03:00
pratham-sarvam
6d582e41b7 Added Sarvam TTS Websocket Implementation (#2356)
* Added Sarvam TTS Websocket Implementation

* Addressed some of the comments on PR

* added change voice logic

* added changes from main

* pushing text frames and added flush audio

* updated docs string for better docs

* Addressed comments and added some improvements

* pushed optional args down

* removed new line

* made aiohttp session mandatory in http service

* added push frame and removed unused function

* removed pong message

* added disconnecting logic

---------

Co-authored-by: vinayak-sarvam <vinayak@sarvam.ai>
2025-08-26 18:10:26 -03:00
kompfner
ca29f62bff Merge pull request #2510 from pipecat-ai/pk/fix-set-tools-types
Update types for tools in `LLMSetToolsFrame` and `LLMContextAggregato…
2025-08-26 14:12:21 -04:00
Aleix Conchillo Flaqué
0dced68c3c Merge pull request #2511 from pipecat-ai/aleix/end-of-pipline-warning
PipelineTask: warn if CancelFrame doesn't reach the end
2025-08-26 11:02:26 -07:00
Aleix Conchillo Flaqué
8ab81d289a PipelineTask: warn if CancelFrame doesn't reach the end 2025-08-26 10:36:33 -07:00
Paul Kompfner
f457d00760 Update types for tools in LLMSetToolsFrame and LLMContextAggregator.set_tools(), for two reasons:
1. `ToolsSchema` has been supported in `LLMSetToolsFrame` for a while but wasn't properly reflected in these type hints
2. The new universal `LLMContext` expects tools to be either a `ToolsSchema` or `NOT_GIVEN`.
2025-08-26 11:32:21 -04:00
kompfner
f5118c4412 Merge pull request #2440 from pipecat-ai/pk/prototype-llm-failover-attempt-4
Support for runtime LLM switching
2025-08-26 09:55:03 -04:00
Paul Kompfner
a79fe40162 Fix a typo in the CHANGELOG 2025-08-26 09:51:48 -04:00
Paul Kompfner
dcb4949e20 Move ServiceSwitcherFrame and ManuallySwitchServiceFrame to frames.py 2025-08-26 09:47:37 -04:00
Paul Kompfner
8b543e558d Add CHANGELOG entry describing LLMService.run_inference() 2025-08-26 09:47:32 -04:00
Paul Kompfner
8181962236 Add CHANGELOG entry describing LLM switcher 2025-08-26 09:46:51 -04:00
Paul Kompfner
98dc891640 Move CHANGELOG log entry from 0.0.81 to Unreleased 2025-08-26 09:45:49 -04:00
Paul Kompfner
71de0da570 ServiceSwitchers are now controlled using frames rather than with direct method calls 2025-08-26 09:44:15 -04:00
Paul Kompfner
b40c8bb81d Refactor LLMSwitcher into a base ServiceSwitcher and an LLMSwitcher that subclasses it 2025-08-26 09:44:15 -04:00
Paul Kompfner
43f1b59b86 Convert LLM generate_summary() methods to the more generic run_inference() 2025-08-26 09:44:15 -04:00
Paul Kompfner
a0a2bb3aa4 In GeminiLLMAdapter, when translating from the universal LLMContext format, only pull out the first "system" message as the system instruction, and convert subsequent ones into "user" messages. This is a more correct thing to do than simply drop subsequent "system" messages, especially when potentially sharing a context between multiple LLMs. 2025-08-26 09:44:15 -04:00
Paul Kompfner
04a50df3d5 Add LLMSwitcher, with LLMSwitcherStrategyManual as the first supported switching strategy 2025-08-26 09:44:15 -04:00
Paul Kompfner
8c0edffaff Fix bug in AWS Bedrock conversation summarization. It was using an out-of-date pattern (the _client property no longer exists) 2025-08-26 09:44:15 -04:00
Paul Kompfner
fe6063fdbe Introduce an affordance to LLMService for generating a summary of a conversation directly (i.e. without going through the pipeline).
This abstraction will allow us to update Pipecat Flows to avoid reaching into LLM service internals to generate summaries.

In addition to being a helpful refactor to remove a fragile part of Pipecat Flows, this change helps set the stage for supporting the upcoming `LLMSwitcher`, where the “active” LLM will only be determined at runtime—today, Pipecat Flows needs to know ahead of time what type of LLM it’s working with, to load an LLM-specific “adapter” that does the work of generating summaries, among other things.
2025-08-26 09:44:15 -04:00
Paul Kompfner
195146adb2 Bump deprecation warning version, as this commit is not expected to ship until version 0.0.82. 2025-08-26 09:44:15 -04:00
Paul Kompfner
cab9e18cc9 Port recent change to LLMAssistantContextAggregator to universal LLMAssistantAggregator 2025-08-26 09:44:15 -04:00
Paul Kompfner
baef688e4e Port recent changes to LLMUserContextAggregator to universal LLMUserAggregator 2025-08-26 09:44:15 -04:00
Paul Kompfner
f1f43fe500 After a rebase, rename foundational examples showing usage of universal context to avoid naming conflict with a recently-added example. 2025-08-26 09:44:15 -04:00
Paul Kompfner
73b63f8d35 Remove unnecessary import 2025-08-26 09:44:15 -04:00
Paul Kompfner
0c14b33e92 Deprecate GoogleLLMOpenAIBetaService 2025-08-26 09:44:15 -04:00
Paul Kompfner
09beaccaf0 Assorted minor improvements after code review 2025-08-26 09:44:15 -04:00
Paul Kompfner
40557a1aae Remove TODO comment 2025-08-26 09:44:15 -04:00
Paul Kompfner
ecc4cc4a79 Add support for universal LLMContext to RTVIObserver 2025-08-26 09:44:15 -04:00
Paul Kompfner
37be8805f4 ruff 2025-08-26 09:44:15 -04:00
Paul Kompfner
93c7e64995 Add missing PERPLEXITY_API_KEY in env.example 2025-08-26 09:44:15 -04:00
Paul Kompfner
9de2bd61a9 Add supports_universal_context for OpenAILLMService subclasses so that we can gradually roll out support for universal LLMContext in a controlled manner.
Also update `get_chat_completions()` implementations with the new argument type.
2025-08-26 09:44:15 -04:00
Paul Kompfner
566af71862 Add CHANGELOG entry for the universal LLMContext machinery 2025-08-26 09:44:15 -04:00
Paul Kompfner
12064bd6e6 Add a bit of helpful info in an error message 2025-08-26 09:44:15 -04:00
Paul Kompfner
a962459151 Change LLMContextAggregatorPair.create(context) to LLMContextAggregatorPair(context) 2025-08-26 09:44:15 -04:00
Paul Kompfner
8fc76a29bc Raise errors when trying to use universal LLMContext with LLM services that don't yet support it 2025-08-26 09:44:15 -04:00
Paul Kompfner
e3019261a5 Fix classes that subclass BaseLLMAdapter by adding placeholder stuff until support for universal LLMContext machinery comes to all LLM services 2025-08-26 09:44:15 -04:00
Paul Kompfner
fa1f6f1c51 In LLMContext, normalize an empty provided ToolsSchema to NOT_GIVEN 2025-08-26 09:44:15 -04:00
Paul Kompfner
337f00c16c Minor fix: add a type annotation 2025-08-26 09:44:15 -04:00
Paul Kompfner
d50922cdcd Update Google adapter to handle possibility of system message in standard format being provided as a list of text parts rather than just a string. 2025-08-26 09:44:15 -04:00
Paul Kompfner
47f5ca6265 Update Gemini adapter to be able to handle LLMSpecificMessages containing Google-formatted messages 2025-08-26 09:44:15 -04:00
Paul Kompfner
2eddb6ffda [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Remove outdated comment
2025-08-26 09:44:15 -04:00
Paul Kompfner
560a6f2247 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Make `LLMContext.add_audio_frames_message()` respect the OpenAI standard format
2025-08-26 09:44:15 -04:00
Paul Kompfner
59ecb19000 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add support for LLM-specific messages in the universal `LLMContext`, to enable using LLM-specific functionality while still using the universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
cfb094b3c8 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Make it so that tools in `LLMContext` are guaranteed to be either a `ToolsSchema` or `NOT_GIVEN`
2025-08-26 09:44:15 -04:00
Paul Kompfner
1f7e8e001b [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Update some types to also allow for universal `LLMContext`
2025-08-26 09:44:15 -04:00
Paul Kompfner
688b136141 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to Google LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
809c4c1bc5 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add to OpenAI LLM service support for universal LLM context
2025-08-26 09:44:15 -04:00
Paul Kompfner
81ca5e6601 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Formatting fix + dead import cleanup
2025-08-26 09:44:15 -04:00
Paul Kompfner
ebc49d2252 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Add a "universal" alias for `OpenAILLMContextAssistantTimestampFrame`: `LLMContextAssistantTimestampFrame`
2025-08-26 09:44:15 -04:00
Paul Kompfner
ff8d158e18 [WIP] Universal (LLM-agnostic) context machinery to support runtime LLM switching.
- Added universal `LLMContext` and associated context aggregators.
2025-08-26 09:44:15 -04:00
Aleix Conchillo Flaqué
37980b0854 Merge pull request #2504 from pipecat-ai/aleix/cartesia-fix-timeout-reconnection
CartesiaTTSService: reconnect on Cartesia's timeout
2025-08-25 15:24:31 -07:00
Aleix Conchillo Flaqué
39ebc2c9c1 CartesiaTTSService: reconnect on Cartesia's timeout 2025-08-25 14:09:03 -07:00
Aleix Conchillo Flaqué
ab61d09ec1 Merge pull request #2502 from pipecat-ai/aleix/pipecat-0.0.81
update CHANGELOG for 0.0.81
2025-08-25 09:28:21 -07:00
Aleix Conchillo Flaqué
e4afc0a13c update CHANGELOG for 0.0.81 2025-08-25 08:22:28 -07:00
Mark Backman
dde3d2395b Merge pull request #2491 from pipecat-ai/mb/update-quickstart 2025-08-23 06:34:37 -07:00
Aleix Conchillo Flaqué
30b36c3d6e Merge pull request #2497 from pipecat-ai/aleix/pipeline-task-fix-cancellation
PipelineTask: handle cancellations gracefully
2025-08-22 22:37:12 -07:00
Mark Backman
de4dfc3ed4 Update deployment steps 2025-08-23 00:19:26 -04:00
Aleix Conchillo Flaqué
a0128516ff PipelineTask: handle cancellations gracefully 2025-08-22 19:04:31 -07:00
Aleix Conchillo Flaqué
db3b8c7325 Merge pull request #2496 from pipecat-ai/aleix/release-evals-always-provide-eval-prompt
scripts(evals): always require an eval prompt
2025-08-22 18:11:33 -07:00
Aleix Conchillo Flaqué
9273ec0f25 scripts(evals): always require an eval prompt 2025-08-22 16:57:47 -07:00
Mark Backman
8dfa1187be Merge pull request #2402 from pipecat-ai/mb/voicemail-detection
Add voicemail detection
2025-08-22 14:51:13 -07:00
Mark Backman
e17fd580c6 Update README 2025-08-22 15:56:56 -04:00
mattie ruth backman
3e3d50a855 Fix issue with request images from the camera introduced in smallwebrtctransport 2025-08-22 15:02:33 -04:00
Mark Backman
402661ae03 Prevent user speaking frames from entering the classifier branch after a conversation is detected 2025-08-22 14:09:45 -04:00
Mark Backman
69c6a95b8a Simplify frames in the NotifierGate 2025-08-22 14:09:45 -04:00
Mark Backman
4d49210a73 Rename system_prompt to custom_system_prompt; improve dev ex for classification prompt requirements 2025-08-22 14:09:45 -04:00
Aleix Conchillo Flaqué
5f8a22ef2f Merge pull request #2493 from pipecat-ai/aleix/runner-task-asyncio-cancellation
PipelineRunner/PipelineTask: fix asyncio task cancellation
2025-08-22 09:13:58 -07:00
Aleix Conchillo Flaqué
606ad0826a Merge pull request #2492 from pipecat-ai/aleix/wait-for-task-deprecated
FrameProcessor: wait_for_task is now deprecated
2025-08-22 09:13:34 -07:00
Mark Backman
57028255ee Update changelog, mention text LLMs only 2025-08-22 12:12:17 -04:00
Mark Backman
87ebbab758 Only set/clear voicemail_event when voicemail is detected 2025-08-22 12:12:17 -04:00
Mark Backman
bd401e8d6f Rename TTSBuffer to TTSGate 2025-08-22 12:12:17 -04:00
Mark Backman
f0dfab23e7 Cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
fbc907c371 Change path to extensions 2025-08-22 12:12:17 -04:00
Mark Backman
40b5ef485d Add base NotifierGate class and ClassifierGate, ConversationGate subclasses 2025-08-22 12:12:17 -04:00
Mark Backman
b30af3e155 Tests specify USER_SPEAKS_FIRST or BOT_SPEAKS_FIRST 2025-08-22 12:12:17 -04:00
Mark Backman
446bb5cddf Refactor callback to event 2025-08-22 12:12:17 -04:00
Mark Backman
1c1ee94074 Add 44 to evals, update evals to support user speaking first 2025-08-22 12:12:17 -04:00
Mark Backman
ac30083b45 Add CHANGELOG entry 2025-08-22 12:12:17 -04:00
Mark Backman
ce579d4266 Make on_voicemail_detected callback required, cleanup logging 2025-08-22 12:12:17 -04:00
Mark Backman
5a07b30c7a Class name changes, add TTSStarted/StoppedFrame to the TTSBuffer 2025-08-22 12:12:17 -04:00
Mark Backman
9da33f3897 Handle multiple user inputs from the user when a voicemail is detected; add a configurable timeout to emitting the callback 2025-08-22 12:12:17 -04:00
Mark Backman
5ca82ec61e Final docstrings, comments, and cleanup 2025-08-22 12:12:17 -04:00
Mark Backman
0067c7df47 Add aggregation to classifier LLM output and validate prompt 2025-08-22 12:12:17 -04:00
Mark Backman
ab03db5b0c Updated prompt, add custom system_prompt input 2025-08-22 12:12:17 -04:00
Mark Backman
238d6bf9ab Add buffering logic 2025-08-22 12:12:17 -04:00
Mark Backman
90ae85bab2 More updates—added new voicemail module 2025-08-22 12:12:17 -04:00
Mark Backman
29e09b2053 POC demo in progress 2025-08-22 12:12:17 -04:00
mattie ruth backman
bad9977e8c PR feedback and more explicit about only supporting exporting 1 video 2025-08-22 11:24:22 -04:00
mattie ruth backman
b987579d54 update smallWebRTC screen support to support the utils format for listening to screenshares 2025-08-22 11:24:22 -04:00
mattie ruth backman
40f1f4ff11 Add support to smallWebRTCTransport for receiving screenshare videos 2025-08-22 11:24:22 -04:00
Aleix Conchillo Flaqué
a3ad31d0f6 README: recommended python version is 3.12 2025-08-21 23:50:00 -07:00
Aleix Conchillo Flaqué
8044c4170d PipelineRunner/PipelineTask: fix asyncio task cancellation 2025-08-21 23:50:00 -07:00
Aleix Conchillo Flaqué
bc51e7abc6 FrameProcessor: wait_for_task is now deprecated 2025-08-21 21:17:47 -07:00
Aleix Conchillo Flaqué
256ecf4d71 Merge pull request #2490 from pipecat-ai/aleix/speechmatics-exceptions
Speechmatics exception handling
2025-08-21 19:48:43 -07:00
Aleix Conchillo Flaqué
c16969c4f5 Merge pull request #2489 from pipecat-ai/aleix/daily-python-0.19.7
pyproject: update daily-python to 0.19.7
2025-08-21 19:48:31 -07:00
Mark Backman
8ef64d8c8d Update quickstart, make it deployable 2025-08-21 22:32:34 -04:00
Aleix Conchillo Flaqué
4947d08733 GladiaSTTService: update logging levels 2025-08-21 18:42:23 -07:00
Aleix Conchillo Flaqué
b61846534d SpeechmaticsSTTService: improve exception handling and logging 2025-08-21 18:42:23 -07:00
Aleix Conchillo Flaqué
8f01cd220a pyproject: update daily-python to 0.19.7 2025-08-21 18:40:01 -07:00
Aleix Conchillo Flaqué
3abaaf80e0 Merge pull request #2487 from pipecat-ai/aleix/watchdog-timers-removal
remove watchdog timers and specific asyncio implementations
2025-08-21 18:37:35 -07:00
Aleix Conchillo Flaqué
13890fa021 github(tests): use python 3.12 to run unit tests/coverage 2025-08-21 18:09:56 -07:00
Aleix Conchillo Flaqué
802af28888 update pytest-asyncio to 1.1.0 2025-08-21 18:09:56 -07:00
Aleix Conchillo Flaqué
24a628c85e remove watchdog timers and specific asyncio implementations
Watchdog timers have been removed. They were introduced in 0.0.72 to help
diagnose pipeline freezes. Unfortunately, they proved ineffective since they
required developers to use Pipecat-specific queues, iterators, and events to
correctly reset the timer, which limited their usefulness and added friction.
2025-08-21 18:09:56 -07:00
Mark Backman
ddab95835b Merge pull request #2474 from pipecat-ai/mb/add-frames-pipeline-idle
Add UserStarted/StoppedSpeakingFrames to idle_timeout_frames
2025-08-21 03:45:46 -07:00
Mark Backman
cb13f4b4cb Add user speaking and transcription frames to idle_timeout_frames 2025-08-21 06:43:10 -04:00
Aleix Conchillo Flaqué
4793277d34 Merge pull request #2480 from pipecat-ai/aleix/replace-asyncio-waitfor
replace asyncio.wait_for with wait_for2.wait_for
2025-08-20 17:43:32 -07:00
Aleix Conchillo Flaqué
28c729cc36 replace asyncio.wait_for with wait_for2.wait_for 2025-08-20 15:26:57 -07:00
Aleix Conchillo Flaqué
4d07c7b77c Merge pull request #2479 from pipecat-ai/aleix/simplify-dtmf-aggregator
DTMFAggregator: no need for interruption task
2025-08-20 15:15:35 -07:00
Aleix Conchillo Flaqué
4ff0567025 BaseObject: allow keyword arguments 2025-08-20 15:14:31 -07:00
Aleix Conchillo Flaqué
1377dec01b DTMFAggregator: no need for interruption task
Now that system frames are queued there's no need to have an additional task to
push a `BotInterruptionFrame`.
2025-08-20 14:35:04 -07:00
Aleix Conchillo Flaqué
42f4d73a63 Merge pull request #2478 from pipecat-ai/aleix/fix-wait-for2-import
timeout: fix wait_for2 import
2025-08-20 14:29:19 -07:00
Aleix Conchillo Flaqué
f1c1ebf852 timeout: fix wait_for2 import 2025-08-20 14:24:16 -07:00
Aleix Conchillo Flaqué
eb6d43f6cb Merge pull request #2476 from pipecat-ai/aleix/add-asyncio-timeout
implement custom asyncio.wait_for()
2025-08-20 14:20:22 -07:00
Aleix Conchillo Flaqué
f387776985 add custom asyncio.wait_for()
This patch uses the `wait_for2` package to implement `asyncio.wait_for()` for
Python < 3.12.

In Python 3.12, `asyncio.wait_for()` is implemented in terms of
`asyncio.timeout()`, which fixed a number of issues. However, this was never
backported (because older versions lack `asyncio.timeout()`) and there are still
many remaining issues in `asyncio.wait_for()`, especially in Python 3.10.

See https://github.com/python/cpython/pull/98518
2025-08-20 14:09:05 -07:00
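The version-dependent selection described in the commit above could look roughly like the following. This is a minimal sketch, not the actual Pipecat code: the helper `fetch_with_timeout` is hypothetical, and the fallback to the standard library when `wait_for2` is not installed is an assumption added so the snippet runs anywhere.

```python
import asyncio
import sys

# On Python 3.12+ the standard implementation is built on asyncio.timeout().
if sys.version_info >= (3, 12):
    wait_for = asyncio.wait_for
else:
    try:
        # Third-party package providing a fixed wait_for() for older versions.
        from wait_for2 import wait_for
    except ImportError:
        # Assumption for this sketch only: fall back to the stdlib version.
        wait_for = asyncio.wait_for


async def fetch_with_timeout(delay: float, timeout: float) -> str:
    """Hypothetical helper: wait for a sleep to finish or give up."""
    try:
        await wait_for(asyncio.sleep(delay), timeout=timeout)
        return "done"
    except asyncio.TimeoutError:
        return "timed out"
```

Callers import the selected `wait_for` from one place, so the choice of implementation stays in a single module.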
Aleix Conchillo Flaqué
5286591826 Merge pull request #2464 from pipecat-ai/aleix/frame-processor-updates
various frame processor updates
2025-08-20 10:11:49 -07:00
Aleix Conchillo Flaqué
6831e63ec9 PipelineTask: use PipelineSource/PipelineSink and remove tasks 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
12bcb7db64 ParallelPipeline: use PipelineSource/PipelineSink and remove tasks 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
1b48b1d860 Pipeline: allow passing user source and sink processors 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
d161e2767f FrameProcessor: allow pausing/resuming system frames 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
4e3af00b6d tests: try to use default SleepFrame time 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
4015aedb86 tests: fix unit tests 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
75a6ee839b BaseObserver: added new on_process_frame 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
13ce02c896 FrameProcessor: add new entry_processors() method 2025-08-20 10:08:54 -07:00
Aleix Conchillo Flaqué
2fd5885dc3 pipeline: implement processors property 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
d743586bfb BasePipeline: move processors_with_metrics() to FrameProcessor 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
8051017895 pipeline: wrap with pipelines, use direct mode and reduce tasks 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
dc7bf98ce5 Pipeline: improve performance by using direct mode 2025-08-20 07:40:21 -07:00
Aleix Conchillo Flaqué
609a43a191 FrameProcessor: added processors/next/previous properties 2025-08-20 07:40:19 -07:00
Aleix Conchillo Flaqué
4fb04422d9 FrameProcessor: remove unused set_parent/get_parent 2025-08-20 07:40:02 -07:00
Mark Backman
2f74a7e674 Merge pull request #2469 from pipecat-ai/mb/11labs-text-normalization
Add apply_text_normalization to ElevenLabs TTS services
2025-08-19 18:21:33 -07:00
Mark Backman
5205f56087 Add apply_text_normalization to ElevenLabs TTS services 2025-08-19 21:19:00 -04:00
Mark Backman
694c792af3 Merge pull request #2470 from pipecat-ai/mb/11labs-settings-reconnect
Update ElevenLabsTTSService: update runtime configuration
2025-08-19 18:18:14 -07:00
TheNotary
48b3ad8f8f adds support for creating InterimTranscriptFrames for Azure speech services 2025-08-19 17:00:42 -05:00
Mark Backman
406e82a842 Merge pull request #2438 from pipecat-ai/mb/delete-old-docs
Remove stale docs
2025-08-19 12:22:54 -07:00
Mark Backman
837de5f893 Merge pull request #2468 from pipecat-ai/mb/fix-mistral-docs-errors
Fix Mistral docstrings build errors
2025-08-19 12:22:26 -07:00
Mark Backman
10b9b1da2f Merge pull request #2471 from pipecat-ai/mb/add-13j
Add foundational 13j for Azure STT
2025-08-19 12:10:03 -07:00
Mark Backman
7854a2ec83 Add foundational 13j for Azure STT 2025-08-19 14:36:31 -04:00
Mark Backman
ac7c69078f Merge pull request #2442 from pipecat-ai/mb/retry-completion
retry_on_timeout: Anthropic, AWS Bedrock
2025-08-19 11:23:43 -07:00
Mark Backman
c9b4356ea6 Update changelog 2025-08-19 14:21:18 -04:00
Mark Backman
b3e4421191 Add retry_on_timeout to AWSBedrockLLMService 2025-08-19 14:20:35 -04:00
Mark Backman
84058c3948 Add retry_on_timeout to AnthropicLLMService 2025-08-19 14:20:35 -04:00
Mark Backman
aebc781419 Update ElevenLabsTTSService to update when voice_settings change 2025-08-19 13:51:10 -04:00
Mark Backman
4160446f4c Update ElevenLabsTTSService: reconnect on model and language changes 2025-08-19 11:32:54 -04:00
Mark Backman
05a14af184 Fix Mistral docstrings build errors 2025-08-19 10:31:03 -04:00
Filipi da Silva Fuchter
89d2ef2bde Merge pull request #2465 from pipecat-ai/filipi/heygen_changing_log_level
Changing heygen log level to trace.
2025-08-19 07:50:11 -03:00
Filipi Fuchter
f550015efb Changing heygen log level to trace. 2025-08-18 18:00:25 -03:00
Abhishek
8bbdc7c8d1 Only set last_frame_time when handling OutputAudioRawFrame
We don't want to set `last_frame_time` on other frames like `HeartBeatFrame`, `LLMGeneratedTextFrame`, `InterruptionFrames` so that we can calculate `diff_time` and compare it against `vad_stop_secs` properly
2025-08-16 16:25:14 +05:30
Mark Backman
8fa44863fb Merge pull request #2455 from pipecat-ai/vp-log-line
log: add Disconnected from ElevenLabs debug log
2025-08-15 14:12:28 -07:00
vipyne
088cb56922 log: add Disconnected from ElevenLabs debug log 2025-08-15 15:05:07 -05:00
Aleix Conchillo Flaqué
a789e5feea Merge pull request #2451 from pipecat-ai/aleix/audio-buffer-processor-overlap
AudioBufferProcessor: fix overlap when buffer size is set
2025-08-14 15:31:50 -07:00
Aleix Conchillo Flaqué
16ca44131c Merge pull request #2452 from pipecat-ai/aleix/runner-daily-direct-handlesigint
Runner: set handle_sigint to True for Daily direct
2025-08-14 15:25:05 -07:00
Mark Backman
418860cf26 Merge pull request #2450 from pipecat-ai/mb/fix-openai-changelog-entry
fix: Move OpenAI retry changelog entry to the correct release
2025-08-14 15:23:00 -07:00
Aleix Conchillo Flaqué
e2fc8b3dce Runner: set handle_sigint to True for Daily direct 2025-08-14 14:55:52 -07:00
Aleix Conchillo Flaqué
8b641089f8 AudioBufferProcessor: fix overlap when buffer size is set 2025-08-14 14:44:08 -07:00
Mark Backman
d36ed755ce fix: Move OpenAI retry changelog entry to the correct release 2025-08-14 17:34:35 -04:00
Mark Backman
7aaf64fe55 Merge pull request #2447 from pipecat-ai/mb/update-foundational-readme
Improve the foundational example README
2025-08-14 09:51:01 -07:00
Mark Backman
5f52008974 Improve the foundational example README 2025-08-14 11:29:04 -04:00
Mark Backman
d520677b23 Merge pull request #2408 from pipecat-ai/mb/add-mistral-llm
Add MistralLLMService
2025-08-14 08:19:18 -07:00
Mark Backman
42bd1e9d40 Add Mistral to README and pyproject.toml 2025-08-14 11:15:52 -04:00
Mark Backman
7f0494aa04 Override build_chat_completion_params for Mistral 2025-08-14 10:32:18 -04:00
Mark Backman
b7ae2989ac Add foundational 14w-function-calling.py 2025-08-14 10:00:46 -04:00
Mark Backman
2b2b0f8121 Add MistralLLMService 2025-08-14 09:57:14 -04:00
Mark Backman
5ca33a2b00 Merge pull request #2445 from pipecat-ai/mb/fix-changelog-asyncai
fix: Changelog for Async AI bugfix
2025-08-14 06:48:08 -07:00
Mark Backman
938dcb613d fix: Changelog for Async AI bugfix 2025-08-14 09:13:03 -04:00
Mark Backman
bc748cf9d0 Merge pull request #2444 from ashotbagh/fix/asyncai-force-flush
fix(asyncai): force flush WS TTS to eliminate stalls
2025-08-14 06:10:16 -07:00
Ashot
3b55d16a49 fix(asyncai): force flush WS TTS to eliminate stalls 2025-08-14 16:34:34 +04:00
Mark Backman
d7f31e0cbd Merge pull request #2387 from pipecat-ai/mb/retry-chat-completion
Retry chat completions for OpenAILLMService and its subclasses
2025-08-13 14:39:40 -07:00
Mark Backman
c662a2d820 Merge pull request #2437 from pipecat-ai/mb/19-english
Foundational 19: Respond in English
2025-08-13 11:57:24 -07:00
Mark Backman
2c220ca54e Remove stale docs 2025-08-13 14:11:41 -04:00
Mark Backman
89f0ff17c0 Merge pull request #2430 from pipecat-ai/aleix/pipecat-0.0.80
update CHANGELOG for 0.0.80
2025-08-13 09:41:43 -07:00
Mark Backman
b5465364fa Foundational 19: Respond in English 2025-08-13 12:37:13 -04:00
Aleix Conchillo Flaqué
c024eb7b8c update CHANGELOG for 0.0.80 2025-08-13 11:46:24 -04:00
Mark Backman
608570e89d Merge pull request #2433 from pipecat-ai/mb/openai-realtime-text-modality
fix: Add text support to OpenAIRealtimeBetaLLMService
2025-08-13 08:41:33 -07:00
Mark Backman
3ad61a8a04 Remove stray - in changelog 2025-08-13 11:39:59 -04:00
Mark Backman
4c4bae2db6 Remove unnecessary messages from 19 and 19b examples 2025-08-13 11:39:59 -04:00
Mark Backman
901b6b5913 Add foundational 19b 2025-08-13 11:37:38 -04:00
Mark Backman
71cd0f1c87 fix: Add text support to OpenAIRealtimeBetaLLMService 2025-08-13 11:37:36 -04:00
Filipi da Silva Fuchter
a2a419e6db Merge pull request #2435 from pipecat-ai/filipi/small_webrtc_end_pipeline
Fixed an issue where `SmallWebRTCTransport` ended before TTS finished.
2025-08-13 11:58:33 -03:00
Filipi Fuchter
bbbbdc459a Fixed an issue where SmallWebRTCTransport ended before TTS finished. 2025-08-13 11:46:51 -03:00
Mark Backman
d203528dad Merge pull request #2333 from yohan-altrium/fix/2277-azure-tts-ssml-reserved-characters
Fixes 2277 - SSML reserved characters cause Azure TTS to fail
2025-08-13 06:27:30 -07:00
Yohan Liyanage
4bcca7956e Refactors the code based on PR comments and adds the relevant changelog entry. 2025-08-13 16:34:33 +05:30
Aleix Conchillo Flaqué
68a4cf4c68 Merge pull request #2427 from pipecat-ai/aleix/base-watchdog-priority-queue
WatchdogPriorityQueue: this is now a base class
2025-08-12 18:25:59 -07:00
Aleix Conchillo Flaqué
0508ddddfb WatchdogPriorityQueue: fix watchdog sentinel insertion
We now force each inserted item in the priority queue to be a tuple and the
actual value to be last in the tuple. All the previous values in the tuple also
need to be numeric.
2025-08-12 17:40:58 -07:00
Mark Backman
8714c9137f Code review fixes 2025-08-12 17:49:13 -04:00
Mark Backman
4c029fcfa7 Update OpenAILLMService subclasses to use the new build_chat_completion_params function 2025-08-12 17:48:51 -04:00
Mark Backman
5c86f8e687 Add timeout/retry logic and refactor parameter building in BaseOpenAILLMService
- Add timeout (default 5.0s) and retry_on_timeout parameters to constructor
- Implement timeout/retry logic in get_chat_completions using asyncio.wait_for
- Extract build_chat_completion_params() as public method for subclass customization
2025-08-12 17:48:51 -04:00
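The timeout and retry behaviour listed in the commit above can be illustrated with a small sketch. It is not the `BaseOpenAILLMService` implementation: `call_with_retry` and `make_call` are hypothetical names, and retrying exactly once without a timeout is an assumption made for the example. Only the 5.0s default, the `retry_on_timeout` flag and the use of `asyncio.wait_for` come from the commit message.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def call_with_retry(
    make_call: Callable[[], Awaitable[T]],
    timeout: float = 5.0,
    retry_on_timeout: bool = False,
) -> T:
    """Run make_call() under a timeout, optionally retrying once."""
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout)
    except asyncio.TimeoutError:
        if not retry_on_timeout:
            raise
        # Assumption: the single retry runs without a timeout.
        return await make_call()
```

Passing a factory (`make_call`) rather than a coroutine object matters here, because a coroutine cannot be awaited a second time after it has been cancelled by the timeout.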
Mark Backman
54a4d8a9f8 Merge pull request #2422 from thsunkid/thu/fix-set-lang-in-base-whisper
Fix: assigns string code instead of Language enum to BaseWhisperSTTService._language
2025-08-12 11:57:46 -07:00
Mark Backman
38af514d95 Merge pull request #2407 from pipecat-ai/mb/add-gemini-tts
Add GeminiTTSService
2025-08-12 11:56:45 -07:00
Aleix Conchillo Flaqué
6aa80c0b8e Merge pull request #2424 from pipecat-ai/aleix/system-frame-queues-fix
FrameProcessor: fix race condition on FrameProcessorQueue
2025-08-12 11:56:00 -07:00
Mark Backman
e720573e60 Added 07n-interruptible-gemini 2025-08-12 14:54:49 -04:00
Mark Backman
541a43905b Add GeminiTTSService 2025-08-12 14:52:20 -04:00
Aleix Conchillo Flaqué
707df913cd FrameProcessor: fix race condition on FrameProcessorQueue
We need to increment the counters before the await; otherwise control could
switch to a different task that adds an item with the same counter.

Also, we need to handle non-frame items as well.
2025-08-12 11:48:22 -07:00
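The ordering issue described in the commit above can be shown with a small priority queue wrapper. This is a sketch under assumptions, not the `FrameProcessorQueue` code: the class `CountedPriorityQueue` and its methods are hypothetical. The counter is incremented and captured before the `await`, so no other task can observe or reuse the same value while this one is suspended.

```python
import asyncio
from typing import Any


class CountedPriorityQueue:
    """Hypothetical queue that breaks priority ties by insertion order."""

    def __init__(self) -> None:
        self._queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
        self._counter = 0

    async def put(self, priority: int, item: Any) -> None:
        # Increment and capture the counter BEFORE awaiting. If the await
        # suspends (for example on a bounded queue), another task calling
        # put() will see the already-updated counter.
        self._counter += 1
        count = self._counter
        await self._queue.put((priority, count, item))

    async def get(self) -> Any:
        # The unique counter means the item itself is never compared, so
        # non-frame (even non-comparable) items are handled as well.
        _, _, item = await self._queue.get()
        return item
```

Lower priority numbers come out first, and items with equal priority come out in the order they were inserted.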
Aleix Conchillo Flaqué
3f3d757581 tests: added WatchdogQueue and WatchdogPriorityQueue unit tests 2025-08-12 11:48:22 -07:00
Aleix Conchillo Flaqué
7c781ce816 WatchdogPriorityQueue: make WatchdogPriorityCancelSentinel public 2025-08-12 11:34:31 -07:00
Aleix Conchillo Flaqué
f3efc9da00 WatchdogQueue: make WatchdogQueueCancelSentinel public 2025-08-12 11:34:31 -07:00
Mark Backman
827a70104d Merge pull request #2425 from pipecat-ai/mb/runner-add-exotel
Add Exotel support to the development runner
2025-08-12 10:36:54 -07:00
Mark Backman
a40327305c Add Exotel support to the development runner 2025-08-12 13:21:18 -04:00
Thu Nguyen
168af44429 Fix: assigns string code instead of Language enum to _language attr of BaseWhisperSTTService 2025-08-12 20:27:26 +07:00
Mark Backman
5f8433476c Merge pull request #2397 from gladiaio/PLA-37-GladiaSTTService-minor-tweaks
feat: add minor tweaks to GladiaSTTService
2025-08-12 04:59:40 -07:00
Fabrice Lamant
6a6fea74f5 fix: set default region to none 2025-08-12 13:31:51 +02:00
Mark Backman
91b557ecbf Merge pull request #2419 from pipecat-ai/mb/fix-lockfile-workflow 2025-08-12 03:39:54 -07:00
Mark Backman
be85291414 Merge pull request #2420 from pipecat-ai/mb/runner-handle-sigint-default 2025-08-12 03:39:29 -07:00
Fabrice Lamant
09f171b69d fix: only pass region if set 2025-08-12 12:05:38 +02:00
Aleix Conchillo Flaqué
929fd98958 Merge pull request #2416 from pipecat-ai/aleix/release-evals-vision
scripts(evals): add vision support
2025-08-11 20:08:08 -07:00
Aleix Conchillo Flaqué
1cfbfcaf11 scripts(evals): add vision support 2025-08-11 20:06:24 -07:00
Mark Backman
cd5a3c13bd Development runner: handle_sigint defaults to False 2025-08-11 22:06:56 -04:00
Mark Backman
9b871b0cc5 Update uv.lock, remove lockfile workflow, update CONTRIBUTING with dependency guidance 2025-08-11 21:39:25 -04:00
Mark Backman
0d499a8aa3 Merge pull request #2409 from pipecat-ai/mb/refactor-playht-http
Refactor PlayHTHttpTTSService to use aiohttp
2025-08-11 18:20:58 -07:00
Mark Backman
45292ab13d Merge pull request #2411 from pipecat-ai/mb/fix-websocket-service-retry
fix: WebsocketService retry logic incorrectly handling ConnectionClos…
2025-08-11 18:17:50 -07:00
Mark Backman
be6ea0dbf6 Code review feedback 2025-08-11 21:17:04 -04:00
Aleix Conchillo Flaqué
fb18ae174e Merge pull request #2417 from pipecat-ai/aleix/release-evals-15-series
scripts(evals): add multilingual support and 15 series
2025-08-11 17:14:47 -07:00
Mark Backman
c4506523ab Refactor PlayHTHttpTTSService to use aiohttp 2025-08-11 19:58:25 -04:00
Aleix Conchillo Flaqué
b360cb31dc scripts(evals): add multilingual support and 15 series 2025-08-11 15:21:14 -07:00
Aleix Conchillo Flaqué
07f104199c Merge pull request #2415 from pipecat-ai/aleix/moondream-2025-01-09
MoondreamService: update to revision 2025-01-09
2025-08-11 15:10:35 -07:00
Aleix Conchillo Flaqué
bc1949b4bf MoondreamService: update to revision 2025-01-09 2025-08-11 14:54:04 -07:00
Aleix Conchillo Flaqué
2035dd8b39 Merge pull request #2403 from pipecat-ai/aleix/system-frame-queue-priority-fix
FrameProcessor: fix system frame higher priority and use a PriorityQueue
2025-08-11 13:57:57 -07:00
Aleix Conchillo Flaqué
24c8189327 Merge pull request #2405 from pipecat-ai/aleix/frame-processor-direct-mode
FrameProcessor: introduce direct mode
2025-08-11 13:57:34 -07:00
Mark Backman
998ac32627 Merge pull request #2413 from captaincaius/fix-stt-mute-filter-vad-frames-20250810
Add VADUserStartSpeakingFrame VADUserStopSpeakingFrame to STTMuteFilter (fix #2412)
2025-08-11 13:54:34 -07:00
Aleix Conchillo Flaqué
50645c1c4f README: recommend python 3.11-3.12
Python 3.11 has significant performance improvements compared to 3.10, which
especially benefits Pipecat's heavy use of asyncio.
2025-08-11 13:53:08 -07:00
Aleix Conchillo Flaqué
8ce29ee8f2 FrameProcessor: fix system frame higher priority and use a PriorityQueue 2025-08-11 13:53:08 -07:00
Captain Caius
7b8aeef4cc update changelog 2025-08-11 12:45:54 -07:00
Aleix Conchillo Flaqué
6a24457f0e FrameProcessor: introduce direct mode
Direct mode avoids creating internal queues and tasks and processes frames right
away. This might be useful for some very simple processors.
2025-08-11 09:26:31 -07:00
Aleix Conchillo Flaqué
2c01c2b5b3 Merge pull request #2404 from pipecat-ai/aleix/examples-22-simplify-main-pipeline
examples(foundational): update 22 series with simple main pipelines
2025-08-11 09:14:39 -07:00
Aleix Conchillo Flaqué
1c2e114fa2 examples(foundational): update 22 series with simple main pipelines 2025-08-11 09:13:09 -07:00
Filipi da Silva Fuchter
0f137e36c2 Merge pull request #2399 from pipecat-ai/filipi/heygen_latency
Improving the latency of the `HeyGenVideoService`.
2025-08-11 09:13:10 -03:00
Filipi Fuchter
b7f12a96f1 Improving the latency of the HeyGenVideoService. 2025-08-11 09:11:17 -03:00
Filipi da Silva Fuchter
3331f71e17 Merge pull request #2398 from pipecat-ai/filipi/ttfb_metrics_video_services
Added TTFB metrics for `HeyGenVideoService` and `TavusVideoService`.
2025-08-11 09:09:27 -03:00
Filipi Fuchter
55d200e2d1 Added TTFB metrics for HeyGenVideoService and TavusVideoService. 2025-08-11 09:07:21 -03:00
Captain Caius
3fae00e067 Add VADUserStartSpeakingFrame VADUserStopSpeakingFrame to STTMuteFilter 2025-08-10 19:35:04 -07:00
Mark Backman
78cdefd191 Merge pull request #2410 from smokyabdulrahman/issue-2373
Support endpoint_id for AzureSTTService
2025-08-10 16:43:29 -07:00
Mark Backman
42502a4f3b fix: WebsocketService retry logic incorrectly handling ConnectionClosedOK exception 2025-08-10 19:35:05 -04:00
Abdulrahman Alrahma
fc67cc3302 Support endpoint_id for AzureSTTService 2025-08-10 22:24:47 +01:00
Aleix Conchillo Flaqué
241ab19228 update uv.lock with numba dependency 2025-08-08 15:12:55 -07:00
Mark Backman
c08e8ec8fb Merge pull request #2391 from pipecat-ai/mb/readme-local-dev
Update README with local dev setup for contributors
2025-08-08 11:15:58 -07:00
Mark Backman
eb9bc9644e Merge pull request #2400 from pipecat-ai/mb/pin-numba-0.61.2
fix: pin numba to >=0.61.2
2025-08-08 11:15:22 -07:00
Mark Backman
3a306dae90 fix: pin numba to >=0.61.2 2025-08-08 10:52:47 -04:00
Fabrice Lamant
e503ea7466 feat: add minor tweaks to GladiaSTTService 2025-08-08 10:21:52 +02:00
Mark Backman
c42cc8254f Update README with local dev setup for contributors 2025-08-07 22:07:35 -04:00
Aleix Conchillo Flaqué
a8e21f7d5d Merge pull request #2395 from pipecat-ai/aleix/examples-15-inherit-parallel-pipeline
examples(foundational): move 15/15a logic into its own processor
2025-08-07 17:59:28 -07:00
Aleix Conchillo Flaqué
c6ef8de578 scripts(evals): fix 14v-function-calling-openai.py 2025-08-07 17:57:47 -07:00
Aleix Conchillo Flaqué
fc571fba42 examples(foundational): move 15/15a logic into its own processor 2025-08-07 17:57:47 -07:00
Mark Backman
0502ee2b5a Merge pull request #2394 from pipecat-ai/mb/uv-lock
Update uv.lock
2025-08-07 15:25:38 -07:00
Mark Backman
9ec047094b Update uv.lock 2025-08-07 18:24:47 -04:00
Mark Backman
d991c106c8 Merge pull request #2393 from pipecat-ai/mb/openai-dep
fix: pin openai package upper bound to <=1.99.1
2025-08-07 15:19:05 -07:00
Mark Backman
312fb23c89 fix: pin openai package upper bound to <=1.99.1 2025-08-07 18:00:25 -04:00
Aleix Conchillo Flaqué
4d7f21d44e Merge pull request #2392 from pipecat-ai/aleix/avoid-using-tts-say
deprecate TTSService.say() method
2025-08-07 13:55:49 -07:00
Aleix Conchillo Flaqué
ec25d0a7c9 examples(foundational): fix 20a-persistent-context-openai 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
2b8218deaa examples(foundational): use TTSSpeakFrame instead of TTSService.say() 2025-08-07 13:48:32 -07:00
Aleix Conchillo Flaqué
11119430cd TTSService: deprecate say() method 2025-08-07 13:48:32 -07:00
kompfner
9ca79232c1 Merge pull request #2380 from pipecat-ai/pk/deprecate-llm-messages-frame
Deprecate `LLMMessagesFrame`, `LLMUserResponseAggregator`, and `LLMAssistantResponseAggregator`
2025-08-07 15:13:01 -04:00
Paul Kompfner
9ea06c33f7 Bump deprecation version of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator (the deprecation slipped past the 0.0.78 release) 2025-08-07 14:56:50 -04:00
Paul Kompfner
30a1dd202e Move deprecation of LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator into the next release in the changelog 2025-08-07 14:55:11 -04:00
Paul Kompfner
809ab0b7b6 Improve printed deprecation warning 2025-08-07 14:45:35 -04:00
Paul Kompfner
2b5db9c562 Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
b4a886b59f Remove redundant deprecation warning in docstring 2025-08-07 14:45:35 -04:00
Paul Kompfner
07eb00722b Fix langchain unit test 2025-08-07 14:45:35 -04:00
Paul Kompfner
96652b8fba Add new deprecations to changelog 2025-08-07 14:45:30 -04:00
Paul Kompfner
df1fcf0c68 Remove unused import 2025-08-07 14:43:37 -04:00
Paul Kompfner
711f740d9e Update UserResponseAggregator to avoid using the now-deprecated LLMUserResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
a0bda98c20 Update langchain to avoid using the now-deprecated LLMMessagesFrame, LLMUserResponseAggregator, and LLMAssistantResponseAggregator 2025-08-07 14:43:37 -04:00
Paul Kompfner
1c1bae35ab Mention deprecation in docstring for LLMMessagesFrame 2025-08-07 14:43:37 -04:00
Paul Kompfner
56c52c2cf2 Deprecate LLMUserResponseAggregator and LLMAssistantResponseAggregator, which depend on the now-deprecated LLMMessagesFrame. 2025-08-07 14:43:37 -04:00
Paul Kompfner
740aee1a1a Fix an issue in AnthropicLLMContext where we would never initialize turns_above_cache_threshold if we were upgrading from an OpenAILLMContext.
I noticed this when working on 22c-natural-conversation-mixed-llms.py
2025-08-07 14:43:37 -04:00
Paul Kompfner
f0391c3280 Progress on updating foundational examples to avoid using the newly-deprecated LLMMessagesFrame.
Skipping over 07b-interruptible-langchain.py for now, as it requires deeper changes involving `LLMUserResponseAggregator` and `LLMAssistantResponseAggregator`.
2025-08-07 14:43:37 -04:00
Paul Kompfner
64e48e4660 Deprecate LLMMessagesFrame.
The same functionality can be achieved using either:
- `LLMMessagesUpdateFrame` with the desired messages, with `run_llm` set to `True`
- `OpenAILLMContextFrame` with a new context initialized with the desired messages
2025-08-07 14:43:37 -04:00
Paul Kompfner
b8147bdbbd Add missing Deepgram key to env.example 2025-08-07 14:43:37 -04:00
Aleix Conchillo Flaqué
315e45d41b Merge pull request #2389 from pipecat-ai/aleix/pipecat-0.0.78
update CHANGELOG for 0.0.78
2025-08-07 11:34:27 -07:00
Aleix Conchillo Flaqué
c057139c48 update CHANGELOG for 0.0.78 2025-08-07 11:14:54 -07:00
Mark Backman
c61e07132d Merge pull request #2390 from pipecat-ai/mb/optionally-ignore-emulated-speech
feat: Add option to ignore emulated user speech while the bot is spea…
2025-08-07 11:14:46 -07:00
Mark Backman
a5f5e418a8 feat: Add option to ignore emulated user speech while the bot is speaking 2025-08-07 14:08:11 -04:00
Mark Backman
31acfaa091 Merge pull request #2388 from pipecat-ai/14v-adding-openai-stt-tts-llm-functioncalling
14v adding OpenAI stt tts llm functioncalling
2025-08-07 10:22:35 -07:00
Mark Backman
69541c8835 Linting fix, plus update eval suite with 14v and others, tiny fix for 14m, too 2025-08-07 13:20:45 -04:00
Varun Singh
af94620839 Add OpenAI function calling example with Pipecat
Introduces a new example script demonstrating how to use OpenAI's function calling capabilities within a Pipecat pipeline. The example integrates OpenAI STT, TTS, and LLM services, registers a weather function, and sets up a pipeline for real-time audio interaction over WebRTC.
2025-08-07 13:20:45 -04:00
Filipi da Silva Fuchter
cec8a74293 Merge pull request #2386 from pipecat-ai/filipi/parallel_pipeline
Only push the StartFrame when all parallel pipelines have processed it
2025-08-07 14:20:30 -03:00
Filipi Fuchter
228a55ac1e Only push the StartFrame when all parallel pipelines have processed it. 2025-08-07 14:18:21 -03:00
Vanessa Pyne
ab9831daf0 Merge pull request #2382 from pipecat-ai/vp-trace-ignore-message
log: warning -> trace for elevenlabs tts unavailable context
2025-08-07 09:35:57 -05:00
Vanessa Pyne
e8c3f5dea6 Update src/pipecat/services/elevenlabs/tts.py
Co-authored-by: Mark Backman <mark@daily.co>
2025-08-07 09:23:33 -05:00
Mark Backman
4288b5e780 Merge pull request #2381 from pipecat-ai/aleix/runner-args-pipeline-idle-timeout
allow specifying PipelineTask idle timeout to runner arguments
2025-08-07 04:47:08 -07:00
Mark Backman
23343dd7e7 Remove idle_timeout_secs from quickstart 2025-08-07 07:44:21 -04:00
Mark Backman
88de5dd415 Merge pull request #2383 from pipecat-ai/aleix/riva-stt-iterator-exception
properly handle concurrent.futures.CancelledError
2025-08-07 04:39:56 -07:00
Mark Backman
33f87589d1 Merge pull request #2384 from pipecat-ai/aleix/release-evals-soniox-inworld-asyncai
scripts(evals): added soniox, inworld and asyncai
2025-08-07 04:35:18 -07:00
Aleix Conchillo Flaqué
7ed14ad91f scripts(evals): added soniox, inworld and asyncai 2025-08-06 23:14:50 -07:00
Aleix Conchillo Flaqué
86c6141580 DailyTransport: handle future cancellation 2025-08-06 23:03:20 -07:00
Aleix Conchillo Flaqué
c97643c797 RivaSTTService: always use WatchdogQueue 2025-08-06 23:00:03 -07:00
Aleix Conchillo Flaqué
434d346079 RivaSTTService: handle future cancellation 2025-08-06 22:59:52 -07:00
vipyne
64ae8d2394 log: warning -> trace for elevenlabs tts unavailable context 2025-08-06 22:40:47 -05:00
Aleix Conchillo Flaqué
786f24c9db examples(foundational): use RunnerArgs.pipeline_idle_timeout_secs 2025-08-06 19:38:06 -07:00
Aleix Conchillo Flaqué
38951aab56 scripts(evals): use RunnerArguments.pipeline_idle_timeout_secs 2025-08-06 19:37:29 -07:00
Aleix Conchillo Flaqué
ed8b0655a8 scripts(evals): fix runner eval cancellation
We need to call asyncio.gather() just once, not for every cancelled task.
2025-08-06 19:36:42 -07:00
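A minimal sketch of the pattern that commit message describes: cancel every task first, then await them with one `asyncio.gather()` call. The `worker` and `cancel_all` names are hypothetical and are not taken from the eval scripts.

```python
import asyncio


async def worker(delay: float) -> str:
    # Stand-in for a long-running eval task (hypothetical).
    await asyncio.sleep(delay)
    return "done"


async def cancel_all(tasks: list) -> list:
    # Request cancellation on every task first...
    for task in tasks:
        task.cancel()
    # ...then wait for all of them with a single gather() call.
    # return_exceptions=True collects each CancelledError instead of raising.
    return await asyncio.gather(*tasks, return_exceptions=True)


async def main() -> list:
    tasks = [asyncio.create_task(worker(10)) for _ in range(3)]
    await asyncio.sleep(0)  # give the tasks a chance to start
    return await cancel_all(tasks)


results = asyncio.run(main())
```

Gathering once keeps the cleanup to a single wait, regardless of how many tasks were cancelled.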
Aleix Conchillo Flaqué
0b2b9f5f1b RunnerArguments: add pipeline_idle_timeout_secs 2025-08-06 19:35:40 -07:00
Filipi da Silva Fuchter
ad1841b739 Merge pull request #2377 from pipecat-ai/filipi/fast_api_freeze_issue
Fixed an issue in BaseOutputTransport where the loop could consume all CPU.
2025-08-06 14:58:36 -03:00
Mark Backman
b0c002c128 Merge pull request #2378 from pipecat-ai/mb/pyproject-compat-updates
Add new python-compatibility workflow to check for dependency compatib…
2025-08-06 10:40:29 -07:00
Mark Backman
820176084c Add support for 3.13 by bumping min version for vllm to 0.9.0, adding support for torch and torchaudio up to the next major version 2025-08-06 13:36:01 -04:00
Mark Backman
5b7e31beff README updates for python versions 2025-08-06 13:36:01 -04:00
Mark Backman
41a22d3bf4 Add new python-compatibility workflow to check for dependency compatibility across supported python versions 2025-08-06 13:36:01 -04:00
Filipi Fuchter
84fecabac5 Removing audio sleep from FastAPI and WebSocket server when they are not connected. 2025-08-06 14:02:51 -03:00
Filipi Fuchter
bbe01d10ef Fixed an issue in BaseOutputTransport where the loop could consume all CPU. 2025-08-06 12:42:58 -03:00
Mark Backman
4364990fd0 Merge pull request #2375 from fabrice404/gladia-region-selection
Gladia region selection
2025-08-06 07:01:24 -07:00
Fabrice Lamant
e576fa481f Add new region feature for GladiaSTTService in CHANGELOG 2025-08-06 15:31:10 +02:00
Mark Backman
ac6b59cae2 Merge pull request #2372 from pipecat-ai/mb/dotenv-dev
Wider package support for python-dotenv dev dep
2025-08-06 06:06:01 -07:00
Mark Backman
12e168e740 Wider package support for python-dotenv dev dep 2025-08-06 09:04:01 -04:00
Mark Backman
ac354f66ed Merge pull request #2371 from pipecat-ai/mb/docs-gen-with-uv
Update docs auto-generation to use uv
2025-08-06 06:02:52 -07:00
Mark Backman
eead793927 Merge pull request #2370 from pipecat-ai/mb/update-workflows-for-uv
Update workflows for uv
2025-08-06 05:54:55 -07:00
Fabrice Lamant
0594a203fc Add new region parameter to Gladia 2025-08-06 14:28:06 +02:00
Mark Backman
2337a2d92d Remove dev-requirements.txt and mentions of it 2025-08-05 21:46:50 -04:00
Mark Backman
b3e2603553 Update workflows for uv 2025-08-05 21:45:48 -04:00
Mark Backman
29229df719 Speed up builds, mocking large packages 2025-08-05 21:34:40 -04:00
Aleix Conchillo Flaqué
61f4dd2ff2 scripts(evals): fix 14e-function-calling-google 2025-08-05 17:44:45 -07:00
Mark Backman
42094fb206 Update docs auto-generation to use uv 2025-08-05 20:37:27 -04:00
Aleix Conchillo Flaqué
58c41f112a DailyRunnerArguments: make body optional (fix) 2025-08-05 16:59:36 -07:00
Aleix Conchillo Flaqué
fa55e2ca9b Merge pull request #2369 from pipecat-ai/aleix/pipeline-task-cancellation-fix
PipelineTask: always try to cancel things
2025-08-05 16:56:23 -07:00
Aleix Conchillo Flaqué
313fdc92a1 DailyRunnerArguments: make body optional 2025-08-05 16:39:18 -07:00
Aleix Conchillo Flaqué
d22d2da03d PipelineTask: always try to cancel things
In a previous commit we only cleaned things up if the user ran
`task.cancel()`. However, if the task finished cleanly, we were not cancelling
anything.
2025-08-05 16:24:59 -07:00
Aleix Conchillo Flaqué
de2ae9a2ec Merge pull request #2368 from pipecat-ai/aleix/release-evals-runner-args-fix
pass runner arguments to release evals
2025-08-05 16:23:32 -07:00
Aleix Conchillo Flaqué
52a6d8013c scripts(evals): pass runner arguments to run_bot() 2025-08-05 16:13:32 -07:00
Aleix Conchillo Flaqué
f14cbae9b5 DailyRunnerArguments: make token optional
DailyTransport can get a None token value.
2025-08-05 15:46:12 -07:00
Aleix Conchillo Flaqué
8fe906438a Merge pull request #2358 from pipecat-ai/aleix/system-frames-queued
system frames are now queued
2025-08-05 15:09:52 -07:00
Mark Backman
d8f4db8827 Merge pull request #2367 from richtermb/richtermb/fix-errorframe-docstring
Rename 'source' parameter to 'processor' in ErrorFrame class document…
2025-08-05 15:09:18 -07:00
Aleix Conchillo Flaqué
a5ea6e1642 FrameProcessor: system frames are now queued
System frames are now queued. Before, system frames could be generated from any
task with no guaranteed ordering, which caused undesired behavior. It was also
possible to hit rare recursion issues because system frames were executed
in-place, meaning a call to `push_frame()` would only return after the system
frame had traversed the whole pipeline. Queuing makes system frames more
deterministic.
2025-08-05 15:05:50 -07:00
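The difference between in-place and queued handling can be illustrated with a small sketch. This is not Pipecat's `FrameProcessor`; `QueuedProcessor`, its string frames, and the `"stop"` sentinel are hypothetical stand-ins, assuming a single consumer task draining an `asyncio.Queue`.

```python
import asyncio


class QueuedProcessor:
    """Hypothetical processor: frames are queued and handled by one task."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self.handled: list = []

    async def push_frame(self, frame: str) -> None:
        # Returns as soon as the frame is enqueued; it does not wait
        # for the frame to be handled.
        await self._queue.put(frame)

    async def run(self) -> None:
        # A single consumer handles frames strictly in arrival order.
        while True:
            frame = await self._queue.get()
            if frame == "stop":
                break
            self.handled.append(frame)


async def main() -> list:
    proc = QueuedProcessor()
    consumer = asyncio.create_task(proc.run())
    # Several producers push concurrently from different tasks.
    await asyncio.gather(*(proc.push_frame(f"sys-{i}") for i in range(3)))
    await proc.push_frame("stop")
    await consumer  # wait until the queue has been drained
    return proc.handled


handled = asyncio.run(main())
```

Because `push_frame()` no longer blocks until the frame is handled, a caller that needs completion has to wait explicitly, as `main()` does by awaiting the consumer.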
richtermb
e777e78510 Rename 'source' parameter to 'processor' in ErrorFrame class documentation for clarity. 2025-08-05 15:02:00 -07:00
Aleix Conchillo Flaqué
49a5a1e375 PipelineTask: improve task cancellation 2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
61cb45d61b PipelineTask: also wait on CancelFrame
Before, CancelFrames didn't need to be waited for because system frames were
processed in-place, so calling push_frame() would return only after the frame
had traversed the whole pipeline. Now, system frames are queued so we need to
wait until CancelFrame reaches the end of the pipeline.
2025-08-05 14:49:23 -07:00
Aleix Conchillo Flaqué
6c6deb4e85 Merge pull request #2366 from pipecat-ai/aleix/run-bot-runner-arguments
add sigint/sigterm to RunnerArguments
2025-08-05 14:46:19 -07:00
Aleix Conchillo Flaqué
66ad29b2b1 example: pass RunnerArguments to run_bot()
This lets us get handle_sigint from RunnerArguments which knows where the
application is running and if SIGINT/SIGTERM should be handled or not.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
21e4f0d56d PipelineRunner: argument ordering 2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
627b44bac2 runner: use new RunnerArguments handle_sigint/handle_sigterm
This allows us to control application behavior from the runner arguments, which
depend on the environment they run in.
2025-08-05 14:38:55 -07:00
Aleix Conchillo Flaqué
e2a576beca RunnerArguments: add handle_sigint/handle_sigterm 2025-08-05 14:32:28 -07:00
Mark Backman
2981afb117 Merge pull request #2361 from pipecat-ai/mb/fix-changelog-simli
Fix Simli changelog entry placement
2025-08-05 14:12:38 -07:00
Mark Backman
d422c57b52 Merge pull request #2304 from pipecat-ai/mb/cartesia-cjk-lang-support
CartesiaTTSService: Add CJK lang support for word timestamps
2025-08-05 14:08:53 -07:00
Mark Backman
06d8bbd154 Fix Simli changelog entry placement 2025-08-05 17:07:58 -04:00
Mark Backman
35108afeb8 Merge pull request #2360 from pipecat-ai/mb/add-heygen-readme
Add HeyGen to the README page
2025-08-05 14:05:33 -07:00
Mark Backman
a0e2a2754a Merge pull request #2327 from richtermb/richtermb/push-more-error-frames
Add source parameter to ErrorFrame and set it in FrameProcessor. Upda…
2025-08-05 14:04:52 -07:00
Mark Backman
b8d620c8bb Merge pull request #2362 from pipecat-ai/mb/aws-stt-languages
AWSTranscribeSTTService add support for new languages
2025-08-05 14:00:50 -07:00
Mark Backman
f26bbe4092 Merge pull request #2363 from pipecat-ai/mb/update-14p
Update 14p, add 14p to evals, add Google creds to env.example
2025-08-05 14:00:13 -07:00
Mark Backman
52cb23f8d5 Merge pull request #2364 from pipecat-ai/mb/11labs-default-model
ElevenLabs TTS services: revert to Turbo v2.5 as default model
2025-08-05 13:59:59 -07:00
Filipi da Silva Fuchter
17e7f8a2cd Merge pull request #2352 from pipecat-ai/filipi/webrtc_audio_frame
Determine whether the bot is speaking based on the SpeechOutputAudioRawFrame
2025-08-05 17:26:44 -03:00
richtermb
efddc4732c Refactor ErrorFrame: rename source field to processor for clarity and update related references in FrameProcessor. 2025-08-05 13:25:08 -07:00
richtermb
4476a76ad7 Merge branch 'main' into richtermb/push-more-error-frames 2025-08-05 13:23:24 -07:00
Filipi Fuchter
64592b274b Fixed an issue where BotStartedSpeakingFrame and BotStoppedSpeakingFrame
were not emitted when using `TavusVideoService` or `HeyGenVideoService`.
2025-08-05 17:11:34 -03:00
Aleix Conchillo Flaqué
95c661bdaa Merge pull request #2365 from pipecat-ai/aleix/update-release-evals-for-new-runner
scripts(evals): update to use new runner function
2025-08-05 13:07:57 -07:00
Aleix Conchillo Flaqué
5546c8e01c scripts(evals): update to use new runner function 2025-08-05 11:46:28 -07:00
Mark Backman
14e02c1b08 ElevenLabs TTS services: revert to Turbo v2.5 as default model 2025-08-05 13:44:37 -04:00
Mark Backman
ba5a5c7187 Update 14p, add 14p to evals, add Google creds to env.example 2025-08-05 13:30:36 -04:00
Mark Backman
2378cba155 AWSTranscribeSTTService add support for new languages 2025-08-05 13:01:06 -04:00
Mark Backman
1138c92a00 Merge pull request #2217 from simliai/main
feat: Add Simli Trinity models support to pipecat
2025-08-05 09:01:20 -07:00
Antonyesk601
fb82dc8308 Update CHANGELOG.md
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-08-05 17:46:01 +02:00
Mark Backman
c8a15f30fa Add HeyGen to the README page 2025-08-05 10:54:49 -04:00
antonyesk601
72168070f1 update changelog 2025-08-05 14:18:41 +00:00
Mark Backman
50083d1144 Merge pull request #2342 from pipecat-ai/mb/runner-connect-request-body
Development runner handles body information in the RTVI connect request
2025-08-05 05:15:55 -07:00
Mark Backman
64732518c6 Development runner handles body information in the RTVI connect request 2025-08-05 07:26:34 -04:00
Mark Backman
c3d8ea210f CartesiaTTSService: Add CJK lang support for word timestamps 2025-08-05 07:17:40 -04:00
Filipi da Silva Fuchter
98ed614f63 Merge pull request #2357 from pipecat-ai/filipi/latency_observer
Added detailed latency logging to UserBotLatencyLogObserver.
2025-08-05 08:11:48 -03:00
Filipi Fuchter
e43bdff31e Added detailed latency logging to UserBotLatencyLogObserver. 2025-08-04 19:36:30 -03:00
Mark Backman
42e48381fe Merge pull request #2355 from pipecat-ai/mb/update-readme-for-uv
Update the README with uv-centric steps
2025-08-04 15:28:07 -07:00
Mark Backman
df7ba64b4a Merge pull request #2354 from pipecat-ai/mb/revert-43-inline-script
Remove inline script from foundational 43a
2025-08-04 15:27:28 -07:00
Mark Backman
ac9b2e67a7 Merge pull request #2349 from pipecat-ai/mb/runner-support-daily-url-arg
daily runner util: remove arg parsing, add auto room, token generation
2025-08-04 13:44:25 -07:00
Mark Backman
c9918607cf Merge pull request #2335 from pipecat-ai/mb/quickstart-runner-improvements
Improve quickstart logging, runner startup message
2025-08-04 13:43:42 -07:00
Mark Backman
cfda410a43 Remove foundational requirements.txt file 2025-08-04 16:38:37 -04:00
Mark Backman
c773ddf83d Update foundational examples README 2025-08-04 16:26:11 -04:00
Mark Backman
54d5ebbc20 Update the README with uv-centric steps 2025-08-04 16:11:38 -04:00
Mark Backman
35002cd727 Remove inline script from foundational 43a 2025-08-04 15:46:18 -04:00
Mark Backman
53d75faa47 Merge pull request #2330 from pipecat-ai/mb/runner-clean-proxy-name
Runner: strip protocol from proxy address
2025-08-04 10:42:16 -07:00
Mark Backman
2901dddc2b Merge pull request #2338 from pipecat-ai/mb/update-release-evals-tavus
Add Tavus, HeyGen, Simli to release-evals
2025-08-04 10:38:27 -07:00
Mark Backman
3a8d809837 Runner: strip protocol from proxy address 2025-08-04 13:38:02 -04:00
Mark Backman
1b3c2bee30 Merge pull request #2331 from pipecat-ai/mb/more-foundational
Updating more foundational examples
2025-08-04 10:37:15 -07:00
Mark Backman
69f049cb63 Merge pull request #2328 from pipecat-ai/mb/04b-example-cleanup
Align 04b livekit example with other foundational examples
2025-08-04 10:36:57 -07:00
Vanessa Pyne
96b1000e52 Merge pull request #2341 from getchannel/realtime-text
Hotfix: Correct Gemini Live API class to fix 1007 payload error.
2025-08-04 11:03:57 -05:00
Filipi da Silva Fuchter
0184a8c231 Merge pull request #2351 from pipecat-ai/filipi/tavus_transport_ready
Changed `TavusVideoService` to send audio or video frames only after the transport is ready, preventing warning messages at startup.
2025-08-04 11:48:27 -03:00
Filipi Fuchter
c22866ed58 Mentioning the TavusVideoService fix in the changelog. 2025-08-04 11:46:24 -03:00
Filipi Fuchter
0e533d21be Only send audio and video from the tavus video service if the transport is ready. 2025-08-04 10:52:30 -03:00
Mark Backman
6f6f4c3dea Merge pull request #2348 from sam-s10s/speechmatics-stt
Fix for Speechmatics STT
2025-08-04 06:15:39 -07:00
Mark Backman
f609971637 daily runner util: remove arg parsing, add auto room, token generation 2025-08-03 21:50:44 -04:00
Mark Backman
54ff10ae86 Merge pull request #2332 from hankehly/fix-piper-tts-json-payload
Fix PiperTTSService to send TTS input as JSON object
2025-08-03 17:39:04 -07:00
hankehly
77057eb829 Fix ruff formatting 2025-08-04 08:13:16 +09:00
Mark Backman
2b1a7b840d Merge pull request #2346 from adenta/heygen-testing
me trying to get heygen working
2025-08-03 14:11:14 -07:00
Sam Sykes
e07db88bc0 Updated changelog. 2025-08-03 22:11:10 +01:00
Sam Sykes
c2282b0e73 Set user_id to "" (not None) for RTVIProcessor. 2025-08-03 22:08:22 +01:00
Andrew Denta
593bf09d8d update script 2025-08-03 17:01:27 -04:00
Sam Sykes
534ed77ebf Merge branch 'main' into speechmatics-stt 2025-08-03 21:51:35 +01:00
Andrew Denta
193299988d me trying to get heygen working 2025-08-03 13:42:07 -04:00
Aleix Conchillo Flaqué
d589bcb345 Merge pull request #2344 from pipecat-ai/aleix/daily-python-0.19.6
pyproject: update daily-python to 0.19.6
2025-08-03 10:15:15 -07:00
Aleix Conchillo Flaqué
011ebc2801 Merge pull request #2345 from pipecat-ai/aleix/task-observer-signature-performance
TaskObserver: don't inspect on_push_frame signature for every frame
2025-08-03 10:15:00 -07:00
Aleix Conchillo Flaqué
3a72e94d0c LLMService: only do handle function inspection once 2025-08-03 10:09:19 -07:00
Aleix Conchillo Flaqué
d6d39fc873 TaskObserver: don't inspect on_push_frame signature for every frame 2025-08-03 10:09:17 -07:00
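A rough sketch of the optimization named in the two commits above: inspect a callback's signature once and reuse the result for every frame. The `Observer` class and both callbacks are hypothetical illustrations, not the `TaskObserver` or `LLMService` code.

```python
import inspect
from typing import Callable


class Observer:
    """Hypothetical wrapper that checks its callback signature once."""

    def __init__(self, on_push_frame: Callable) -> None:
        self._callback = on_push_frame
        # Inspect once at construction instead of on every frame.
        self._num_params = len(inspect.signature(on_push_frame).parameters)

    def notify(self, frame: str, direction: str) -> object:
        # Dispatch based on the cached parameter count.
        if self._num_params == 1:
            return self._callback(frame)
        return self._callback(frame, direction)


def old_style(frame):
    # Callback that only takes the frame.
    return frame


def new_style(frame, direction):
    # Callback that also takes the direction.
    return (frame, direction)
```

`inspect.signature()` is comparatively expensive, so caching its result matters when `notify()` runs for every frame in a pipeline.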
Pete
258e83c904 Fix: Correct Gemini Live API text input to prevent 1007 WebSocket errors
- Restore TextInputMessage.realtimeInput structure for correct API format
- Remove invalid turnComplete message from _send_user_text method
- turnComplete is only valid for clientContent, not realtimeInput messages
- realtimeInput text completion is automatically inferred by the API

This fixes WebSocket 1007 errors caused by mixing realtimeInput and
clientContent message types in violation of the Gemini Live API contract.
2025-08-03 10:58:59 -04:00
Mark Backman
061f2086b2 Merge pull request #2343 from pipecat-ai/mb/update-pre-commit-ruff-version
Update pre-commit-config ruff version
2025-08-03 03:38:54 -07:00
Aleix Conchillo Flaqué
a1f3f51168 pyproject: update daily-python to 0.19.6 2025-08-02 20:02:22 -07:00
hankehly
2177a2b805 Remove trailing space 2025-08-03 10:34:27 +09:00
hankehly
68164415ce Format changelog entry 2025-08-03 10:26:37 +09:00
hankehly
7646599b66 Add changelog entry 2025-08-03 10:23:58 +09:00
Mark Backman
e467eaf130 Merge pull request #2334 from Designedforusers/fix/tavus-transport-daily-callbacks
fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
2025-08-02 16:09:57 -07:00
Designedforusers
9d6d53629e style: Apply ruff formatting to fix line length
Fixed a line length issue in tavus.py where the on_transcription_stopped callback was exceeding the maximum line length. Split the partial() call across multiple lines for better readability and compliance with project style guidelines.
2025-08-02 18:27:09 -04:00
Mark Backman
89596cfec4 Update pre-commit-config ruff version 2025-08-02 18:06:06 -04:00
Designedforusers
5e338ecaf1 refactor: Remove redundant transcription callback methods
As suggested in PR review, removed the _on_transcription_stopped and
_on_transcription_error method definitions. Now using the consistent
partial(self._on_handle_callback, ...) pattern for these callbacks,
matching how all other callbacks are handled.

This simplifies the code while maintaining the same functionality.
2025-08-02 15:02:54 -04:00
Designedforusers
62319021f8 docs: Add changelog entry for TavusVideoService fix
Added changelog entry as requested by maintainers for the fix addressing
missing transcription callbacks in TavusVideoService.
2025-08-02 14:53:44 -04:00
Pete
cccd82a617 Refactor TextInputMessage class to replace realtimeInput with a text attribute.
This was causing a 1007 error because RealtimeInput was being wrapped in the JSON.

- Updated the `TextInputMessage` class to directly store text input as a string.
- Modified the `from_text` class method to create an instance using the new `text` attribute.
2025-08-02 14:34:00 -04:00
Mark Backman
f552ba1f5e Merge pull request #2336 from pipecat-ai/mb/suppress-pydub-warning
Suppress pydub (cartesia dependency) SyntaxWarning
2025-08-02 10:01:05 -07:00
Mark Backman
b9a2a9b729 Add Tavus, HeyGen, Simli to release-evals 2025-08-02 09:35:06 -04:00
Mark Backman
e43b3869c3 Suppress pydub SyntaxWarning from the cartesia module 2025-08-02 08:49:59 -04:00
Mark Backman
55731df999 Improve quickstart logging, runner startup message 2025-08-02 08:40:05 -04:00
Designedforusers
3a7ea25077 fix: Add missing transcription callbacks to TavusTransport for 0.0.77 compatibility
TavusTransport was broken in Pipecat 0.0.77 due to PR #2292 adding required
callbacks (on_transcription_stopped, on_transcription_error) to DailyCallbacks.

This fix adds placeholder implementations of these callbacks to TavusTransportClient,
allowing TavusTransport to initialize properly. These callbacks are not used by
Tavus (which handles avatar video, not transcription) but are required by the
DailyCallbacks validation.

Fixes initialization error:
- 2 validation errors for DailyCallbacks
- on_transcription_stopped: Field required
- on_transcription_error: Field required
2025-08-02 05:56:46 -04:00
Yohan Liyanage
248206e234 Fixes 2277 - SSML reserved characters in LLM generated text cause Azure TTS to fail. 2025-08-02 12:49:29 +05:30
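One common way to handle reserved characters is to XML-escape the text before embedding it in SSML. The sketch below uses the standard library's `xml.sax.saxutils.escape`; `build_ssml` and the voice name are hypothetical, and this is not necessarily how the fix itself was implemented.

```python
from xml.sax.saxutils import escape


def build_ssml(text: str, voice: str = "en-US-ExampleNeural") -> str:
    # Escape &, < and > (plus quotes) so LLM output cannot break the XML.
    # The voice name is a placeholder, not a real Azure voice.
    safe = escape(text, {'"': "&quot;", "'": "&apos;"})
    return (
        '<speak version="1.0" xml:lang="en-US">'
        f'<voice name="{voice}">{safe}</voice>'
        "</speak>"
    )


ssml = build_ssml("a < b & c")
```

Escaping only the text node leaves the surrounding SSML tags intact while keeping characters such as `&` and `<` from being parsed as markup.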
hankehly
694922f627 Fix PiperTTSService to send TTS input as JSON object 2025-08-02 15:29:16 +09:00
Mark Backman
cc9950e72d Updating more foundational examples 2025-08-01 19:58:40 -04:00
richtermb
6814c390ba Update CHANGELOG to reflect the addition of the source field in ErrorFrame for improved error tracking. 2025-08-01 14:47:57 -07:00
Richter Brzeski
c2d05ad23b Merge branch 'pipecat-ai:main' into richtermb/push-more-error-frames 2025-08-01 14:47:08 -07:00
Mark Backman
ee56d8572d Merge pull request #2329 from pipecat-ai/mb/fix-livekit-empty-audio-frames
fix: LiveKitTransport, don't push empty AudioRawFrames
2025-08-01 12:53:05 -07:00
richtermb
91568eeddc Update type hint for source in ErrorFrame to use forward declaration for improved clarity. 2025-08-01 12:52:56 -07:00
richtermb
165d6b4c1d Update CHANGELOG to include new source field in ErrorFrame for error tracking. 2025-08-01 12:25:29 -07:00
Mark Backman
1d8abe3c1c fix: LiveKitTransport, don't push empty AudioRawFrames 2025-08-01 14:57:53 -04:00
Mark Backman
a6e69d6aad Merge pull request #2325 from pipecat-ai/mb/dependency-groups
Move dev to [dependency-groups], update uv.lock
2025-08-01 11:54:21 -07:00
Mark Backman
519da9cc61 Align 04b livekit example with other foundational examples 2025-08-01 14:28:15 -04:00
richtermb
ead4e97ab5 Add source parameter to ErrorFrame and set it in FrameProcessor. Updated error handling in AnthropicLLMService and DeepgramSTTService to include ErrorFrame with source information. 2025-08-01 11:14:50 -07:00
Mark Backman
0c021378b0 Merge pull request #2326 from pipecat-ai/readme-quickstart-link
Update README.md
2025-08-01 10:45:30 -07:00
Mark Backman
e22c7e8ad5 Update README.md 2025-08-01 13:40:03 -04:00
Mark Backman
b71057bf7c Move dev to [dependency-groups], update uv.lock 2025-08-01 09:43:56 -04:00
Mark Backman
0865f6cd7d Merge pull request #2318 from pipecat-ai/mb/add-asyncio-readme
Add AsyncAI TTS to README vendor list
2025-08-01 06:11:13 -07:00
Mark Backman
610b1ab065 Merge pull request #2319 from pipecat-ai/mb/use-new-runner
Update foundational examples to use new runner
2025-08-01 06:11:03 -07:00
Mark Backman
3a2a226668 Merge pull request #2320 from pipecat-ai/mb/uv-lock-init
Add initial uv.lock file
2025-08-01 06:07:53 -07:00
Mark Backman
8e4b7352fd Merge pull request #2321 from pipecat-ai/mb/dev-requirements
Add dev to optional-dependencies
2025-08-01 06:02:58 -07:00
Mark Backman
637d372fe4 Add dev to optional-dependencies 2025-07-31 23:39:23 -04:00
Mark Backman
ac15fe8ae4 Add workflow to update lockfile with pyproject.toml changes 2025-07-31 23:08:21 -04:00
Mark Backman
07239c0b8b Add initial uv.lock file 2025-07-31 22:46:44 -04:00
Mark Backman
367b2fbe3c Update requirements.txt 2025-07-31 22:12:57 -04:00
Mark Backman
f1b1d5b130 Update foundational examples to use the development runner 2025-07-31 22:11:32 -04:00
Mark Backman
ff45b77fdf Remove examples runner 2025-07-31 21:22:04 -04:00
Mark Backman
e522b7ae96 Add AsyncAI TTS to README vendor list 2025-07-31 19:33:37 -04:00
Mark Backman
b8eef4f93b Merge pull request #2314 from pipecat-ai/mb/sync-quickstart-example
Add workflow to sync quickstart to pipecat-quickstart repo
2025-07-31 15:34:26 -07:00
Mark Backman
dcc205996a Merge pull request #2317 from pipecat-ai/mb/release-prep-0.0.77
Changelog update for 0.0.77
2025-07-31 15:34:02 -07:00
Mark Backman
9f61af4d1b Changelog update for 0.0.77 2025-07-31 18:19:05 -04:00
Sam Sykes
e8faf28e6a Doc fix for incorrect argument name. 2025-07-31 22:30:54 +01:00
Filipi da Silva Fuchter
40d53b3d84 Merge pull request #2316 from sam-s10s/speechmatics-stt
Updated to SpeechmaticsSTTService
2025-07-31 18:28:16 -03:00
Sam Sykes
7c223a86c2 Fix to missing deprecated attribute enable_speaker_diarization. 2025-07-31 22:25:46 +01:00
Sam Sykes
2d3f61aa07 Updated Speechmatics Plugin (#2225)
Changes
- Split out module attributes to make engine settings clearer
- Removed internal audio buffer to use latest Speechmatics python SDK (0.4.0)
- Use diarization for improved VAD in multi-speaker situations
- Support custom dictionary / vocabulary with attributes
- Deprecated attributes superseded by re-organised attributes

Diarization Enhancements
- Focus on specific speakers (using speaker labels)
- Ignore specific speakers (using speaker labels)
- Separate transcription formats for active and inactive speakers
- Support for known speakers
2025-07-31 17:51:38 -03:00
Mark Backman
e05a47744d Merge pull request #2311 from pipecat-ai/mb/quickstart-fixups
Set quickstart min pipecat-ai version to 0.0.77, remove non-quickstart examples
2025-07-31 13:42:10 -07:00
Aleix Conchillo Flaqué
6ffaab2b93 CHANGELOG cosmetics 2025-07-31 13:39:37 -07:00
Aleix Conchillo Flaqué
c2d8844903 Merge pull request #2312 from pipecat-ai/aleix/srhinos/main
Enable Interruption Support for LLMUserResponseAggregator
2025-07-31 13:30:57 -07:00
Mark Backman
e8caba7723 Add workflow to sync quickstart to pipecat-quickstart repo 2025-07-31 15:56:18 -04:00
Mark Backman
df96ef7d37 Remove non-quickstart demos 2025-07-31 15:38:54 -04:00
Aleix Conchillo Flaqué
7553f670af fix formatting and update CHANGELOG 2025-07-31 10:41:11 -07:00
Mark Backman
6960f5861b Set example min pipecat-ai version to 0.0.77 due to runner requirement 2025-07-31 12:21:58 -04:00
Mark Backman
b5edbbc0ca Merge pull request #2309 from pipecat-ai/mb/remove-runner-examples
Remove examples/runner-examples
2025-07-31 09:18:22 -07:00
Aleix Conchillo Flaqué
e78d9c2c95 Merge pull request #2293 from azain47/azain47/fix-piper-tts-service
Fix Piper TTS Service
2025-07-31 08:32:17 -07:00
Vanessa Pyne
b25547a98b Merge pull request #2305 from pipecat-ai/vp-changelog-text-input
update changelog
2025-07-31 10:16:47 -05:00
Mark Backman
e80281c3c4 Remove examples/runner-examples 2025-07-31 10:59:06 -04:00
Mark Backman
d692843e5b Merge pull request #2308 from pipecat-ai/mb/change-neuphonic-url
NeuphonicTTSService: change the default url value to the global endpoint
2025-07-31 07:38:57 -07:00
Mark Backman
eaad3c5d55 NeuphonicTTSService: change the default url value to the global endpoint 2025-07-31 10:24:54 -04:00
vipyne
f2a1c66379 update changelog 2025-07-31 08:55:25 -05:00
Vanessa Pyne
af8de227bb Merge pull request #2223 from getchannel/realtime-text
Add text input handling to unify context for realtimeInput stream of GeminiMultimodalLiveService
2025-07-31 08:53:39 -05:00
Mark Backman
7cd78dd286 Merge pull request #2303 from pipecat-ai/mb/add-new-quickstart-demos
Add quickstart demos
2025-07-31 05:56:00 -07:00
Mark Backman
226b516948 Add quickstart demos 2025-07-30 22:14:10 -04:00
Mark Backman
aa85fffa57 New runner module (#2269)
* Adds pipecat.runner.run - FastAPI-based development server with automatic bot discovery

* Adds new RunnerArguments types for different transports

* Adds new runner utils for creating transports and parsing data

* Adds new Daily and LiveKit utils for setup
2025-07-30 22:02:28 -04:00
srhinos
8b97ab70ff Enable Interruption Support for LLMUserResponseAggregator 2025-07-30 20:48:31 -04:00
Filipi da Silva Fuchter
9013b2929a Merge pull request #2300 from pipecat-ai/filipi/fast_api_race_condition
Fixed a race condition in FastAPIWebsocketClient
2025-07-30 18:10:09 -03:00
Filipi Fuchter
0c6e12a9b0 Fixed a race condition in FastAPIWebsocketClient that occurred when attempting to send a message while the client was disconnecting. 2025-07-30 18:07:40 -03:00
Aleix Conchillo Flaqué
efb24071d5 Merge pull request #2301 from pipecat-ai/aleix/daily-python-0.19.5
pyproject: update daily-python to 0.19.5
2025-07-30 14:01:27 -07:00
Filipi da Silva Fuchter
318ebec67e Merge pull request #2298 from pipecat-ai/filipi/google_interruptions
Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used.
2025-07-30 17:49:07 -03:00
Aleix Conchillo Flaqué
c679227aa8 pyproject: update daily-python to 0.19.5 2025-07-30 13:19:48 -07:00
Filipi Fuchter
392853f5fa Fixed an issue in GoogleLLMService where interruptions did not work when an interruption strategy was used. 2025-07-30 12:10:32 -03:00
Mark Backman
98d27caab3 Merge pull request #2296 from pipecat-ai/mb/switch-rime-voices
Added the ability to switch voices to RimeTTSService
2025-07-30 07:55:52 -07:00
Mark Backman
0fa51968bf Added the ability to switch voices to RimeTTSService 2025-07-30 10:53:14 -04:00
Mark Backman
92aee2634b Merge pull request #2291 from pipecat-ai/mb/remove-on-client-closed 2025-07-30 06:36:32 -07:00
Filipi da Silva Fuchter
bff6a93f31 Merge pull request #2150 from pipecat-ai/filipi/hey_gen
HeyGen implementation for Pipecat - HeyGenVideoService
2025-07-30 09:10:07 -03:00
Filipi Fuchter
6e921cdf45 HeyGen implementation for Pipecat - HeyGenVideoService 2025-07-30 09:07:15 -03:00
Azain.
1e2b066cf3 Fix tts.py
Update Piper TTS Service to work with the newer Piper GPL version, which uses JSON as its payload.
2025-07-30 13:41:27 +05:30
Pete
2af3b6329d Ruff format debug 2025-07-29 17:48:11 -04:00
Pete
8ca06e5887 Add InputTextRawFrame class for handling raw text input in frames
- Introduced `InputTextRawFrame` to represent raw text input from users or programs.
- Updated `GeminiMultimodalLiveLLMService` to process `InputTextRawFrame` and send user text via the Gemini Live API's realtime input stream.
- Enhanced `_send_user_text` method documentation for clarity on its functionality and usage.
2025-07-29 17:43:14 -04:00
Mark Backman
c145a9ef13 Merge pull request #2288 from pipecat-ai/mb/stt-mute-examples
Update placement of STTMuteFilter in examples to reflect the new reco…
2025-07-29 12:10:36 -07:00
Mark Backman
b523f9a4c6 Merge pull request #2248 from ashotbagh/feat/async-tts
feat(tts): integrate Async TTS engine into pipecat
2025-07-29 12:10:11 -07:00
Mark Backman
7f184422d0 Merge branch 'main' into feat/async-tts 2025-07-29 12:06:56 -07:00
Aleix Conchillo Flaqué
fa4c3ec6bf Merge pull request #2287 from pipecat-ai/aleix/asyncio-trace-logging
utils(asyncio): use trace logging for some cancelling messages
2025-07-29 11:56:25 -07:00
Aleix Conchillo Flaqué
9fafc10844 Merge pull request #2292 from richtermb/transcription-error-callback
Add on_transcription_error callback to DailyCallbacks and handle tran…
2025-07-29 11:56:02 -07:00
richtermb
67107d02ed Refactor callback invocation for on_transcription_stopped in DailyTransportClient for improved readability 2025-07-29 11:53:41 -07:00
richtermb
c1df19982c Add on_transcription_stopped callback to DailyTransport for handling transcription stop events 2025-07-29 11:50:16 -07:00
richtermb
444b1b5b02 Add on_transcription_stopped callback to DailyCallbacks and implement handling in DailyTransport for transcription stop events 2025-07-29 11:49:28 -07:00
Mark Backman
ebfa4f2d5e Push the STTMuteFrame upstream and downstream 2025-07-29 14:37:36 -04:00
Mark Backman
e961c438e7 Update placement of STTMuteFilter in examples to reflect the new recommendation 2025-07-29 14:36:39 -04:00
richtermb
d3d36a89e2 Add _on_transcription_error method to DailyTransport for handling transcription error events 2025-07-29 10:48:50 -07:00
richtermb
fa6e5ce4a7 Add on_transcription_error callback to DailyCallbacks and handle transcription errors in DailyTransportClient 2025-07-29 10:43:18 -07:00
Mark Backman
3ffb261864 Remove use of on_client_closed event in foundational examples 2025-07-29 13:28:33 -04:00
Mark Backman
f69a02b7a7 Merge pull request #2290 from pipecat-ai/mb/remove-examples
Removed most pipecat examples, relocating to pipecat-examples repo
2025-07-29 10:02:24 -07:00
Mark Backman
f1f4aed398 Remove random Dockerfile, update README links 2025-07-29 11:41:27 -04:00
Mark Backman
414c245c92 Remove android.yaml github workflow 2025-07-29 11:34:48 -04:00
Mark Backman
3f57d94c0b Update examples README 2025-07-29 11:28:40 -04:00
Mark Backman
15e3c69ddc Removed most pipecat examples, relocating to pipecat-examples repo 2025-07-29 11:17:01 -04:00
Ashot
39b00f5269 chore: address review comments 2025-07-29 18:20:50 +04:00
Mark Backman
4c368c78c6 Merge pull request #2289 from tomoima525/tomoima525/transcription-bucket-params
Add transcription_bucket param for rest helper
2025-07-29 05:05:08 -07:00
Tomoaki Imai
6eb00a99cb update Changelog for transcription_bucket params addition 2025-07-29 20:38:12 +09:00
Tomoaki Imai
3ae8cf1916 Add transcription_bucket param for rest helper 2025-07-29 18:58:38 +09:00
Aleix Conchillo Flaqué
03e87469df utils(asyncio): use trace logging for some cancelling messages 2025-07-28 17:43:41 -07:00
Mark Backman
70255d3c81 Merge pull request #2274 from pipecat-ai/mb/remove-message-name
Remove message["name"] addition when pushing
2025-07-28 17:29:23 -07:00
Mark Backman
96a72d0647 Remove message[name] addition when pushing 2025-07-28 20:13:50 -04:00
Mark Backman
27d4910694 Merge pull request #2286 from pipecat-ai/mb/fix-transcript-processor-newline
fix: Improve TranscriptProcessor detection for transcript type
2025-07-28 17:07:43 -07:00
Mark Backman
50242f4ad8 fix: Improve TranscriptProcessor detection for transcript type 2025-07-28 19:56:36 -04:00
Mark Backman
c9dda5251c Merge pull request #2284 from pipecat-ai/mb/yank-74-75
Add yanked notices to 0.0.74 and 0.0.75 changelogs
2025-07-28 12:55:47 -07:00
Mark Backman
419cc9ac68 Add yanked notices to 0.0.74 and 0.0.75 changelogs 2025-07-28 13:36:02 -04:00
Ashot
83b4747196 chore: address review comments 2025-07-28 17:52:17 +04:00
Ashot
a13b954415 formatting/cleanup: address Copilot PR review comments 2025-07-28 17:43:17 +04:00
Ashot
f2e9562f1b feat(tts): integrate Async TTS engine into pipecat 2025-07-28 17:42:57 +04:00
Mark Backman
afed9a61f2 Merge pull request #2268 from pipecat-ai/mb/inworld-changelog
Add changelog entry for InworldTTSService
2025-07-28 05:59:57 -07:00
Mark Backman
f0de27b35e Merge pull request #2273 from pipecat-ai/mb/gitignore-plivo-stream
.gitignore: add plivo-chatbot streams.xml, .python-version
2025-07-28 05:59:41 -07:00
Filipi da Silva Fuchter
9d5510ee47 Merge pull request #2265 from pipecat-ai/filipi/small_webrtc_buffer_processor
Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport
2025-07-28 09:23:58 -03:00
Mark Backman
434c3fc527 Merge pull request #2279 from Allenmylath/patch-26
Update README.md
2025-07-28 04:51:58 -07:00
allenmylath
aba79a9478 Update README.md
Explanatory comment. Even though the code allows for transport params, the README does not clearly state which options are available.
2025-07-28 11:24:27 +05:30
Mark Backman
fc96e091a9 Update NVIDIA in README 2025-07-26 15:01:52 -04:00
Mark Backman
851a27c082 Add Groq TTS to README 2025-07-26 14:58:07 -04:00
Mark Backman
a72d93dc6d Add Inworld to README 2025-07-26 14:55:19 -04:00
Mark Backman
c971232f20 .gitignore: add plivo-chatbot streams.xml, .python-version 2025-07-26 10:19:50 -04:00
Mark Backman
4b2ba2d69f Merge pull request #2270 from Allenmylath/patch-25
Update README.md
2025-07-26 04:55:59 -07:00
allenmylath
240a698fab Update README.md
Recreating examples without activating Bedrock models will cause errors. Warning and link added.
2025-07-26 14:51:19 +05:30
Mark Backman
9aaae01063 Add changelog entry for InworldTTSService 2025-07-25 21:46:02 -04:00
Mark Backman
41c8d22cf3 Merge pull request #2208 from padillamt/mtp/add-inworld-tts
Inworld HTTP TTS Service
2025-07-25 17:13:37 -07:00
padillamt
b68f044ef7 mtpadilla: updated example to reflect parameter placement changes in base Inworld TTS class 2025-07-25 15:13:43 -07:00
padillamt
e140bd6960 mtpadilla: moved model and voice id setting into the class constructor 2025-07-25 14:04:49 -07:00
Filipi Fuchter
e86b55e2b3 Fixed an issue in AudioBufferProcessor when using SmallWebRTCTransport where, if the microphone was muted, track timing was not respected. 2025-07-25 17:01:41 -03:00
padillamt
4a9bec5b35 mtpadilla: stop metrics at result chunk 2025-07-25 11:14:20 -07:00
padillamt
37361391d9 mtpadilla: removed ability to set base_url via constructor, set internally based on streaming variable 2025-07-25 09:16:56 -07:00
Filipi da Silva Fuchter
4b3726eba4 Merge pull request #2260 from pipecat-ai/filipi/audio_resampler
Fixed an issue in `AudioBufferProcessor` that caused garbled audio
2025-07-25 09:27:42 -03:00
padillamt
8e66794759 mtpadilla: switch to Deepgram ASR for lower latency 2025-07-24 22:22:36 -07:00
padillamt
acc5b9f210 inworld: change to function that stops all processing metrics
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 22:07:15 -07:00
padillamt
f982ace4c5 inworld: removal of unnecessary setting of sampling rate since it matches the default 2025-07-24 21:56:01 -07:00
padillamt
5fb1899aeb inworld: removal of unnecessary default assignment as already handled 2025-07-24 21:42:42 -07:00
padillamt
7483422bd9 inworld: change set_voice to use self._settings
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:23:03 -07:00
padillamt
16c20f3a99 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:34 -07:00
padillamt
d248c102c8 inworld: removal of unnecessary default assignment since already done
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-24 21:15:20 -07:00
padillamt
662550cc5e mtpadilla: remove unused imports 2025-07-24 21:05:22 -07:00
padillamt
067f64389b mtpadilla: no longer needed so making empty 2025-07-24 20:44:27 -07:00
padillamt
81048ce43a mtpadilla: rename 07aa-interruptible-inworld-http.py to 07ab-interruptible-inworld-http.py 2025-07-24 20:42:29 -07:00
padillamt
f6440ee6e1 mtpadilla: correct Examples header in comments 2025-07-24 13:36:40 -07:00
padillamt
da8c67114a mtpadilla: make streaming the default for example 2025-07-24 13:35:29 -07:00
Mark Backman
d8ea1311ff Merge pull request #2254 from pipecat-ai/mb/11labs-create-context-id
ElevenLabsTTSService: Only reset the context_id when interruptions ar…
2025-07-24 13:32:37 -07:00
Mark Backman
2be615066c Merge pull request #2261 from pipecat-ai/mb/foundational-requirements
Foundational requirements.txt: add silero, websocket optional dep, re…
2025-07-24 11:06:16 -07:00
Mark Backman
75c2ffc0b5 Check if audio context is already available, create one if not 2025-07-24 13:57:46 -04:00
Mark Backman
2297eb217e ElevenLabsTTSService: Only reset the context_id when interruptions are enabled 2025-07-24 13:53:44 -04:00
Mark Backman
1bb821a07d Foundational requirements.txt: add silero, websocket optional dep, remove fastapi 2025-07-24 13:49:44 -04:00
Filipi Fuchter
970b8044a0 Fixed an issue in AudioBufferProcessor that caused garbled audio when enable_turn_audio was enabled and audio resampling was required. 2025-07-24 13:25:48 -03:00
Filipi da Silva Fuchter
d8bcb81f35 Merge pull request #2259 from pipecat-ai/filipi/eleven_labs_delayed_messages
Play delayed messages from `ElevenLabsTTSService` if they still belong to the current context.
2025-07-24 12:07:06 -03:00
Filipi da Silva Fuchter
3ce0ab8c6d Removing extra space.
Co-authored-by: Mark Backman <mark@daily.co>
2025-07-24 12:05:17 -03:00
Filipi Fuchter
097d786431 Fixing ruff format. 2025-07-24 12:03:17 -03:00
Filipi Fuchter
662f04879c Play delayed messages from ElevenLabsTTSService if they still belong to the current context. 2025-07-24 12:00:14 -03:00
Mark Backman
7a69f57e11 Merge pull request #2255 from pipecat-ai/mb/pyproject-versions-for-uv
pyproject.toml dependency updates to support better cross compatibility
2025-07-24 06:43:35 -07:00
Mark Backman
5b7b4efdc9 Add broader version support for stable core dependencies, up to the next major version 2025-07-24 09:40:52 -04:00
Mark Backman
cfa26524ca Add support for fastapi>=0.115.6,<0.117.0 2025-07-24 09:37:42 -04:00
Mark Backman
3d4ab7158d pyproject.toml dependency updates to support better cross compatibility 2025-07-24 09:37:42 -04:00
Mark Backman
26d1ca3c98 Merge pull request #2256 from pipecat-ai/mb/refactor-neuphonic-http
NeuphonicHttpTTSService: Refactor to use POST API
2025-07-24 06:36:23 -07:00
Mark Backman
083b32887e NeuphonicHttpTTSService: Refactor to use POST API 2025-07-24 01:05:37 -04:00
padillamt
b6367965cb mtpadilla: consolidate streaming and non-streaming options into a single class with common API, with boolean switch variable added (streaming) 2025-07-23 16:50:32 -07:00
padillamt
147bf9cfe8 mtpadilla: addition of non-streaming option with own dedicated class, and related additional non-streaming test option 2025-07-23 15:28:43 -07:00
Mark Backman
3391929127 Merge pull request #2252 from pipecat-ai/mb/example-axios-version-bump
Update axios in daily-pstn-server example due to transitive vulnerabi…
2025-07-23 13:30:58 -07:00
padillamt
a5d353030e mtpadilla: small formatting fix to comments 2025-07-23 12:02:58 -07:00
padillamt
f29024bcc0 mtpadilla: update comments regarding temperature parameter 2025-07-23 11:47:26 -07:00
Mark Backman
ebf9bc2741 Merge pull request #2246 from ydlamba/ydlamba/missing-livekit-event
fix(livekit): emit on_audio_track_subscribed event
2025-07-23 11:27:10 -07:00
Mark Backman
f5edde42f6 Update axios in daily-pstn-server example due to transitive vulnerability with form-data 2025-07-23 14:22:13 -04:00
Filipi da Silva Fuchter
37bb7ef926 Merge pull request #2239 from pipecat-ai/filipi/daily_log
Added `set_log_level` to `DailyTransport`
2025-07-23 14:48:34 -03:00
Filipi Fuchter
a63d1530a4 Added set_log_level to DailyTransport. 2025-07-23 14:43:53 -03:00
Yash Dev Lamba
960bc9df5b chore(changelog): add entry for LiveKitTransport audio subscribed event fix 2025-07-23 22:41:20 +05:30
Mark Backman
e2a153ee01 Merge pull request #2242 from pipecat-ai/mb/websockets-14
Upgrade websockets to support asyncio implementation
2025-07-23 08:58:08 -07:00
Mark Backman
300f19ad23 Port to the websockets asyncio implementation, support for websockets 13 and 14 2025-07-23 11:54:25 -04:00
Mark Backman
7955080da2 Change extra_headers to additional_headers, update websocket version support 2025-07-23 11:53:43 -04:00
Mark Backman
994e82c1ef Merge pull request #2243 from pipecat-ai/mb/word-wrangler-twilio-readme
Update Word Wrangler phone bot README to include deployment info
2025-07-23 07:04:19 -07:00
Mark Backman
b07b947352 Merge pull request #2244 from pipecat-ai/mb/upgrade-deepgram-4.7.0
Deepgram: Update optional dependency to 4.7.0
2025-07-23 07:04:02 -07:00
Filipi da Silva Fuchter
a6527c3856 Merge pull request #2240 from pipecat-ai/filipi/sig_term
Adding support for handle_sigterm
2025-07-23 08:15:50 -03:00
antonyesk601
1cbf7ae480 fix: remove unused variable; fix: remove redundant logic 2025-07-23 08:26:44 +00:00
Yash Dev Lamba
0e6874b605 fix(livekit): emit on_audio_track_subscribed event 2025-07-23 08:23:45 +05:30
Mark Backman
9ba172c49f Merge pull request #2236 from dbtreasure/fix/python-311-compatibility
Fix Python 3.11+ compatibility by pinning numba/llvmlite versions
2025-07-22 18:20:38 -07:00
dbtreasure
f710c94b6e Address code review feedback: remove explicit llvmlite pin
- Remove explicit llvmlite>=0.44.0 pin as numba>=0.61.0 automatically pulls compatible version
- Add changelog entry for Python 3.11+ dependency fix

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:32 -06:00
dbtreasure
6e3a0a2d5d Add explicit numba/llvmlite pins for Python 3.11+ compatibility
Fixes dependency resolution issues where transitive dependencies
through resampy would install incompatible versions:
- numba>=0.61.0 (supports Python 3.10-3.13)
- llvmlite>=0.44.0 (supports Python 3.10-3.13)

Previously, older versions (numba 0.53.1, llvmlite 0.36.0) only
supported Python 3.6-3.9, causing deployment failures on Python 3.11+.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-22 18:45:02 -06:00
Mark Backman
9530b8b842 Merge pull request #2235 from pipecat-ai/mb/nltk-tokenizer
Update match_endofsentence to use NLTK sentence tokenizer
2025-07-22 17:22:23 -07:00
Mark Backman
26c937af87 Update match_endofsentence to use NLTK sentence tokenizer 2025-07-22 20:19:29 -04:00
Mark Backman
976f6168f0 Deepgram: Update optional dependency to 4.7.0 2025-07-22 20:15:30 -04:00
Mark Backman
0be64e0fd9 Update Word Wrangler phone bot README to include deployment info 2025-07-22 20:10:20 -04:00
Filipi Fuchter
7d527c3a6b Mentioning the new field in the changelog. 2025-07-22 19:32:52 -03:00
Filipi Fuchter
c6f6930c27 Adding support for handle_sigterm 2025-07-22 17:24:07 -03:00
Mark Backman
c33dfe8309 Merge pull request #2233 from pipecat-ai/mb/enable-tracing-flag
fix: enable_tracing PipelineParam controls the service class decorators
2025-07-22 08:14:32 -07:00
Mark Backman
769cd1ef06 fix: enable_tracing PipelineParam controls the service class decorators 2025-07-22 11:10:53 -04:00
Mark Backman
6d72f60571 Merge pull request #2234 from pipecat-ai/mb/fix-minimax-pitch
fix: MiniMaxHttpTTSService pitch, add base_url arg
2025-07-22 08:10:01 -07:00
Mark Backman
e8d0712ac1 Merge pull request #2238 from pipecat-ai/mb/patch-form-data
Fix form-data vulnerability in pipecat-cloud-daily-pstn-server
2025-07-22 08:09:49 -07:00
Mark Backman
88b2c817ac Fix form-data vulnerability in pipecat-cloud-daily-pstn-server 2025-07-22 10:08:25 -04:00
Mark Backman
f8f6c9918d Merge pull request #2237 from pipecat-ai/mb/pipecat-cloud-example-pipeline-runner-args
Update Pipecat Cloud example to use handle_sigint=False in PipelineRu…
2025-07-22 06:55:56 -07:00
Mark Backman
8ee608bbfe Update Pipecat Cloud example to use handle_sigint=False in PipelineRunner args 2025-07-22 09:52:57 -04:00
Mark Backman
fad2ba4570 Merge pull request #2204 from yousifa/mcp-FunctionCallParams 2025-07-22 05:01:32 -07:00
Mark Backman
f609f7eb53 fix: MiniMaxHttpTTSService pitch, add base_url arg 2025-07-21 21:16:35 -04:00
Mark Backman
ea09813a2b Merge pull request #2227 from pipecat-ai/mb/fix-11labs-wordtimestamps
fix: Improve ElevenLabsTTSService word/timestamp calculation accuracy
2025-07-21 16:07:07 -07:00
Mark Backman
53abfc27a7 fix: Improve ElevenLabsTTSService word/timestamp calculation accuracy 2025-07-21 18:48:38 -04:00
padillamt
1915407ff7 inworld: removed unreferenced is_first_chunk variable 2025-07-21 15:30:48 -07:00
Mark Backman
9c72e96a2c Merge pull request #2230 from pipecat-ai/mb/livekit-tenacity
Livekit: change tenacity supported versions
2025-07-21 15:28:38 -07:00
Mark Backman
f66c67c4ab Merge pull request #2232 from pipecat-ai/mb/fix-ollama-args
Fix: Ollama kwargs error
2025-07-21 15:26:13 -07:00
Mark Backman
b623face03 Add Ollama function calling example 14u 2025-07-21 17:52:16 -04:00
Mark Backman
698d60f3ae fix: OLLamaLLMService pass base_url as kwarg 2025-07-21 17:51:11 -04:00
Mark Backman
c9717a23a5 Livekit: change tenacity supported versions 2025-07-21 17:30:18 -04:00
padillamt
076a675a75 inworld: Fix...Set sample_rate=None in InworldHttpTTSService to match Cartesia pattern 2025-07-21 13:50:36 -07:00
padillamt
0d5292c4ef inworld: typo fix in voice name 2025-07-21 13:48:13 -07:00
padillamt
4853d5d55c inworld: updated InworldHttpTTSService initialization 2025-07-21 13:27:25 -07:00
padillamt
8eda2435a2 inworld: removed explicit references to language since our models currently infer that from the text. 2025-07-21 13:24:10 -07:00
Mark Backman
d981ce6e56 Merge pull request #2226 from pipecat-ai/mb/11labs-speed-docstring
Fix 11Labs speed docstring
2025-07-21 13:21:45 -07:00
padillamt
54ff946976 inworld: largely adjustments for docstring compatibility 2025-07-21 12:07:58 -07:00
Mark Backman
1bbd3bd8ab Fix 11Labs speed docstring 2025-07-21 14:58:12 -04:00
padillamt
aadd088b50 inworld: commented out contents as per Pipecat guidance that this pattern is being retired 2025-07-21 10:52:55 -07:00
padillamt
4250aa6616 inworld: removal of backup copy, no longer needed 2025-07-21 10:11:50 -07:00
Kwindla Hultman Kramer
a20915caa7 Merge pull request #2224 from pipecat-ai/khk/mps
Add MPS backend auto-detection to local smart-turn v2
2025-07-21 09:24:51 -07:00
Vanessa Pyne
28cab5a606 Merge pull request #1932 from getchannel/groundingMetadata
Add groundingMetadata to Gemini Multimodal Live Service
2025-07-21 10:09:26 -05:00
Vanessa Pyne
cfea56064d small merge-main nit fixes - gemini_multimodal_live events.py 2025-07-21 09:54:15 -05:00
Vanessa Pyne
8467d87cfc small main-merge fixes - gemini.py 2025-07-21 09:52:32 -05:00
Kwindla Hultman Kramer
b20d020bea Add MPS backend auto-detection to local smart-turn v2 2025-07-20 20:18:45 -04:00
padillamt
e3711f96a3 inworld: added detailed comments 2025-07-20 17:06:35 -07:00
Pete
948257c66e Merge branch 'main' into groundingMetadata 2025-07-20 19:54:30 -04:00
Pete
b54d1fb7fd Resolve merge conflict and remove duplicate File API initialization
- Remove duplicate file_api initialization lines
- Keep grounding metadata tracking functionality
- Maintain clean code structure
2025-07-20 19:15:40 -04:00
Pete
ec361df0d1 Fix final ruff linting issues
- Remove duplicate import in __init__.py
- Clean up extra blank lines in gemini.py
- Remove extra blank line in _create_single_response method
2025-07-20 18:58:54 -04:00
Pete
b1a5cddde4 Refactor whitespace and formatting in multiple files
- Clean up unnecessary whitespace in `gemini.py`, `events.py`, and `file_api.py`
- Ensure consistent formatting in `26g-gemini-multimodal-live-groundingMetadata.py`
- Improve readability by aligning code and removing trailing spaces
2025-07-20 18:40:12 -04:00
Pete
e165d38277 remove truncated logging from debug 2025-07-20 18:27:21 -04:00
Pete
8ba340a8a5 remove debug logging 2025-07-20 18:21:42 -04:00
Pete
8f74b97591 Refactor _send_user_text method in Gemini multimodal service to streamline event creation for turn completion 2025-07-20 18:08:45 -04:00
Pete
1d69cd1a5e Remove debug logging from _send_user_text method in Gemini multimodal service 2025-07-20 18:04:57 -04:00
Pete
bd7a0f27cc Add text input handling to Gemini multimodal service
- Updated `RealtimeInput` to include an optional `text` parameter.
- Introduced `TextInputMessage` class for encapsulating text input data.
- Implemented `_send_user_text` method to send text input to the Gemini Live API.
- Enhanced message processing to support text input alongside media chunks.
2025-07-20 17:39:31 -04:00
padillamt
5d8c184d99 inworld: commit of original text file and changes that copy openai's with Inworld TTS as only change 2025-07-18 16:30:03 -07:00
padillamt
1bc442e329 inworld: docstring fix 2025-07-18 15:13:19 -07:00
kompfner
d4e33663b2 Merge pull request #2214 from pipecat-ai/pk/fix-google-llm-context
Fixed an issue in `GoogleLLMContext` where it would inject the `syste…
2025-07-18 09:28:28 -04:00
marcus-daily
d7d1b16dad Removing old import 2025-07-18 12:48:06 +01:00
marcus-daily
0bc2ea13f2 Updating changelog 2025-07-18 12:48:06 +01:00
marcus-daily
b5d1301221 Fix linter warnings 2025-07-18 12:48:06 +01:00
marcus-daily
ed8f30ec71 Add support for running smart-turn-v2 locally 2025-07-18 12:48:06 +01:00
antonyesk601
688031efd6 fix: use undeclared variable _preinitialized. fix: double send of start frame 2025-07-18 08:23:04 +00:00
kompfner
a74a935ca0 Merge pull request #1910 from matejmarinko-soniox/main
Add Soniox STT service integration
2025-07-17 09:29:07 -04:00
antonyesk601
0f9e69d3c7 feat: Add Simli Trinity models support to pipecat 2025-07-17 11:55:40 +00:00
padillamt
f3984aec33 inworld: added (empty) requirements for Inworld to be explicit reg dependencies 2025-07-16 13:21:32 -07:00
Paul Kompfner
7cfd56699b Fixed an issue in GoogleLLMContext where it would inject the system_message as a "user" message into cases where it was not meant to; it was only meant to do that when there were no "regular" (non-function-call) messages in the context, to ensure that inference would run properly. 2025-07-16 16:07:53 -04:00
Matej Marinko
cb984237a7 Fix lint error 2025-07-16 16:54:28 +02:00
Matej Marinko
c969fdddb9 Rename and simplify VAD finalization parameter usage 2025-07-16 09:47:34 +02:00
padillamt
2b76823b01 inworld: added comments to track a few things to confirm 2025-07-15 18:17:30 -07:00
padillamt
ca936bd569 inworld: added Inworld to list of needed credentials 2025-07-15 18:11:50 -07:00
padillamt
c67b779b91 inworld: first commit of Inworld example file for TTS 2025-07-15 17:21:16 -07:00
padillamt
913dba3b74 inworld: class name change 2025-07-15 17:15:57 -07:00
padillamt
384838147a inworld: removed unnecessary code from stop() and cancel() 2025-07-15 16:56:18 -07:00
padillamt
7861b911c0 inworld: first commit of __init__ and tts.py files 2025-07-15 16:50:50 -07:00
Mark Backman
9931ad2ce1 Merge pull request #2199 from Dev-Khant/add-host-support-in-Mem0
Add `host` support in Mem0 Memory
2025-07-15 11:41:15 -07:00
Filipi da Silva Fuchter
fd73feb645 Merge pull request #2201 from pipecat-ai/filipi/stt_issue
Only create the EmulateUserStartedSpeakingFrame if we have received a transcription
2025-07-15 13:56:11 -03:00
Yousif Astarabadi
ee78428a2a formatted 2025-07-14 20:38:28 -07:00
Yousif Astarabadi
ae02249255 mcp_tool_wrapper using FunctionCallParams 2025-07-14 20:31:22 -07:00
Filipi Fuchter
727af2e6fb Only create the EmulateUserStartedSpeakingFrame if we have received a transcription. 2025-07-14 17:38:03 -03:00
Mark Backman
8fd5576879 Merge pull request #2198 from Allenmylath/patch-24
Update app.py
2025-07-14 06:37:42 -07:00
kompfner
1f85dcee7c Merge pull request #2171 from pipecat-ai/pk/aws-strands-demo
Minimal AWS Strands demo
2025-07-14 09:32:16 -04:00
Dev Khant
138890bc5c Add support in Mem0 Memory 2025-07-14 18:08:25 +05:30
Filipi da Silva Fuchter
a094efc9e6 Merge pull request #2196 from pipecat-ai/mb/lmnt-model
LmntTTSService: update the default model to blizzard
2025-07-14 09:15:17 -03:00
allenmylath
1f9e2fdecc Update app.py
misleading comment. no endpoints.py
2025-07-14 14:02:35 +05:30
Mark Backman
4a2b4660bc LmntTTSService: update the default model to blizzard 2025-07-13 10:54:43 -07:00
Mark Backman
b3ac90015a Merge pull request #2195 from Trinary-Projects/transformers_ver_patch
Update transformers dep. to >=4.48.0 for Ultravox
2025-07-11 23:31:47 -07:00
Jaideep
2fe06f0a4e Update pyproject.toml 2025-07-12 11:34:45 +05:30
Mark Backman
1836a7484e Merge pull request #2193 from pipecat-ai/mb/changelog-0.0.76
Prepare changelog for 0.0.76 release
2025-07-11 16:15:34 -07:00
Mark Backman
25a5c5aaab Prepare changelog for 0.0.76 release 2025-07-11 16:08:08 -07:00
mattie ruth backman
24694e2558 Changelog entry 2025-07-11 14:30:12 -07:00
mattie ruth backman
2325edd9ba Add a text entry box to the simple-chatbot example 2025-07-11 14:30:12 -07:00
mattie ruth backman
fad5713ade Fix append-to-context function call 2025-07-11 14:30:12 -07:00
Paul Kompfner
fe8573322f AWS Strands demos 2025-07-11 16:42:27 -04:00
Mark Backman
06c1255abe fix: use a different aggregation timeout for emulated user speech (#2185)
* fix: use a different aggregation timeout for emulated user speech

* Add SpeechControlParamsFrame

* Update test_context_aggregator tests
2025-07-11 16:33:44 -04:00
Mark Backman
f108a67635 Merge pull request #2189 from pipecat-ai/mb/numpy-version-bump
Update numpy, transformers to support newer versions
2025-07-11 12:02:02 -07:00
Mark Backman
bf580d061d Update numpy, transformers to support newer versions 2025-07-11 11:58:31 -07:00
Filipi da Silva Fuchter
b005bd7b98 Merge pull request #2184 from pipecat-ai/filipi/twilio_issue
Fixing an issue where Pipecat was not receiving the user's audio
2025-07-11 15:32:28 -03:00
Filipi Fuchter
75f8baab33 Mentioning the fixes in the changelog. 2025-07-11 11:56:16 -03:00
Matej Marinko
5c3fb73cef Rename example 2025-07-11 16:07:24 +02:00
Filipi Fuchter
5c3f4180b9 Refactored VAD analyzer to process multiple audio frames in a single iteration if needed. 2025-07-11 10:59:32 -03:00
Mark Backman
6cd6e7ceed Merge pull request #2186 from pipecat-ai/mb/fix-pre-commit-config
Update .pre-commit-config.yaml to use pyproject.toml linting rules
2025-07-11 06:34:01 -07:00
Filipi Fuchter
1a146c2a64 Not serializing a JSON in case we have no audio. 2025-07-11 10:15:09 -03:00
Filipi Fuchter
eaeb9e6efa Not creating InputAudioRawFrame in case we don't have bytes. Fixed for Plivo, Exotel and Telnyx. 2025-07-11 09:51:38 -03:00
Matej Marinko
2e84c91748 Remove outdated parameter 2025-07-11 08:52:39 +02:00
Matej Marinko
650d45c1f4 Use single sample rate parameter 2025-07-11 08:27:06 +02:00
Filipi Fuchter
f4f65024ef Refactoring the test client to use the new version of the Pipecat Client SDK. 2025-07-10 21:57:25 -03:00
Filipi Fuchter
1200aa4fb8 Not creating InputAudioRawFrame in case we don't have bytes. 2025-07-10 21:56:34 -03:00
Filipi da Silva Fuchter
6762363685 Merge pull request #2183 from pipecat-ai/filipi/parallel_pipeline_issue
Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues.
2025-07-10 21:51:04 -03:00
Filipi Fuchter
b2ead325c4 Fixed an issue in ParallelPipeline that caused errors when attempting to drain the queues. 2025-07-10 21:50:35 -03:00
Mark Backman
4e24b915cc Update .pre-commit-config.yaml to use pyproject.toml linting rules 2025-07-10 16:10:27 -07:00
kompfner
b610ee26ba Merge pull request #2181 from pipecat-ai/pk/fix-aws-nova-sonic-pipeline-freeze
Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if…
2025-07-10 16:30:55 -04:00
Paul Kompfner
2b867f1613 Fix a pipeline freeze when using AWS Nova Sonic. The freeze occurs if the user starts speaking before we've finished sending the "trigger" audio (AWS Nova Sonic can only start speaking in response to a user utterance, so we have a simulated user utterance to "trigger" the bot speaking without the user having actually spoken first). 2025-07-10 15:57:05 -04:00
Mark Backman
7b8fe565c7 Merge pull request #2182 from pipecat-ai/mb/run-example-usage
run.py: Add example usage to the module docstring
2025-07-10 12:48:29 -07:00
Mark Backman
a246862910 run.py: Add example usage to the module docstring 2025-07-10 11:41:49 -07:00
Filipi da Silva Fuchter
106809f3fd Merge pull request #2166 from carolin-tavus/remove-persona-microphone-check
feat: Remove persona microphone check
2025-07-10 15:28:35 -03:00
carolin-tavus
f0d8499f7e feat: avoid checking microphone enabled 2025-07-10 09:40:27 +00:00
Mark Backman
332ca3d55e Merge pull request #2177 from pipecat-ai/mb/fix-ruff-improvements
Make fix-ruff.sh more flexible, use pyproject rules
2025-07-09 12:33:05 -07:00
Mark Backman
a48f5d5796 Make fix-ruff.sh more flexible, use pyproject rules 2025-07-09 11:48:17 -07:00
Mark Backman
f04f047428 Merge pull request #2176 from pipecat-ai/mb/pre-commit-config
Add docstring checking to .pre-commit-config.yaml
2025-07-09 11:47:25 -07:00
Mark Backman
4e61fd33ea Add docstring checking to .pre-commit-config.yaml 2025-07-09 11:18:40 -07:00
Matej Marinko
61ac77be72 Update docs 2025-07-09 11:59:45 +02:00
Matej Marinko
c093eb5b63 Move config to main file 2025-07-09 10:20:37 +02:00
Matej Marinko
98e24131bd Send raw result 2025-07-09 09:59:04 +02:00
Matej Marinko
7becce9e8c Add transcript tracing 2025-07-09 09:37:58 +02:00
Matej Marinko
3cdaeb719a Update examples to new format 2025-07-09 09:28:43 +02:00
Matej Marinko
8daaea5969 Minor code cleanup 2025-07-09 09:03:02 +02:00
matejmarinko-soniox
dc47516e14 Update src/pipecat/services/soniox/config.py
Co-authored-by: Mark Backman <m.backman@gmail.com>
2025-07-09 08:04:59 +02:00
Mark Backman
0fcc4f822f Merge pull request #2173 from captaincaius/fix-nextjs-webhook-example-null-check
fix nextjs webhook example num_endpoints null check
2025-07-08 14:10:16 -07:00
Captain Caius
c0ed061ff5 fix nextjs webhook example num_endpoints null check 2025-07-08 13:40:26 -07:00
Mark Backman
d98b6b418d Release prep for 0.0.75 (#2169)
* Update CHANGELOG for 0.0.75 release

* Add RTVI changes to CHANGELOG

* Update CHANGELOG, add deprecation directives to rtvi.py

---------

Co-authored-by: mattie ruth backman <mattieruth@gmail.com>
2025-07-08 15:39:44 -04:00
Mark Backman
deea29b5e8 Merge pull request #2170 from pipecat-ai/mb/update-packages-0.0.75
Package updates to run the release evals
2025-07-08 12:02:22 -07:00
Mark Backman
0bdbc83ed9 Package updates to run the release evals 2025-07-08 11:39:49 -07:00
Mark Backman
6c591f0990 Merge pull request #2167 from pipecat-ai/mb/fix-riva-watchdog
RivaSTTService: reset the watchdog in an async function
2025-07-08 11:29:44 -07:00
Mark Backman
b55b9c257b RivaSTTService: remove reset_watchdog, which is handled in the WatchdogQueue already 2025-07-08 11:19:47 -07:00
Mark Backman
5156c21d14 Merge pull request #2168 from pipecat-ai/mb/fix-neuophonic-tts
Fix: NeuphonicTTSService to use latest websocket API
2025-07-08 11:17:58 -07:00
Mark Backman
a9d824753b Fix: NeuphonicTTSService to use latest websocket API 2025-07-08 11:08:08 -07:00
Filipi da Silva Fuchter
3c6a208101 Merge pull request #2148 from pipecat-ai/filipi/aws_bedrock
Refactoring AWSBedrockLLMService to work async
2025-07-08 12:14:28 -03:00
Mark Backman
b1032a1ca4 Merge pull request #2151 from pipecat-ai/mb/ollama-kwargs
OLLamaLLMService: Pass kwargs
2025-07-08 07:21:09 -07:00
Mark Backman
931f34fccd OLLamaLLMService: Pass kwargs 2025-07-08 07:17:18 -07:00
Mark Backman
f2509adec1 Merge pull request #2162 from pipecat-ai/mb/tts-service-aggregate-sentences
TTS services: Add aggregate_sentences arg
2025-07-08 07:14:58 -07:00
Filipi Fuchter
285b82eb65 Mentioning the AWSBedrockLLMService and AWSPollyTTSService refactors in the changelog. 2025-07-08 07:30:30 -03:00
Filipi Fuchter
74da197304 Refactored AWSBedrockLLMService and AWSPollyTTSService to work asynchronously using aioboto3 instead of the boto3 library. 2025-07-08 07:28:23 -03:00
Matej Marinko
0f727248d2 Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-07-08 08:20:10 +02:00
mattie ruth backman
a6de16f92f Bump all client dependencies to use client-js/react/transports 1.0.0 2025-07-07 15:56:08 -07:00
mattie ruth backman
fc09854d7f fix cam light always on 2025-07-07 15:56:08 -07:00
mattie ruth backman
2959029151 PR Review fixes 2025-07-07 15:56:08 -07:00
mattie ruth backman
e590441b7b Add support for about info in ready messages and add deprecation comments to deprecated types 2025-07-07 15:56:08 -07:00
mattie ruth backman
dc41ec7cb1 Updated all examples with clients to use the new PipecatClient 2025-07-07 15:56:08 -07:00
mattie ruth backman
43049c865c Add support for new RTVI client message protocol: handling and responding 2025-07-07 15:56:08 -07:00
mattie ruth backman
c4a9fc7f88 video-transport typescript formatting 2025-07-07 15:56:08 -07:00
mattie ruth backman
faf4026cf4 Add device controls to the simple chatbot example 2025-07-07 15:56:08 -07:00
Mark Backman
f53f45a6cd TTS services: Add aggregate_sentences arg 2025-07-07 15:38:31 -07:00
Mark Backman
e04e876f44 Merge pull request #2156 from shahrukhx01/add-additional-whisper-models
WhisperSTTService: Add additional whisper model variants
2025-07-07 12:53:06 -07:00
shahrukhx01
a84e7e30da WhisperSTTService: Add additional whisper model variants 2025-07-07 21:43:48 +02:00
Mark Backman
6eed6ff779 Merge pull request #2147 from pipecat-ai/mb/user-idle-long-function-call
UserIdleProcessor: Account for function calls in progress
2025-07-04 14:11:16 -07:00
Mark Backman
1375211610 UserIdleProcessor: Account for function calls in progress 2025-07-04 14:05:05 -07:00
Mark Backman
4e9369a702 Merge pull request #2149 from pipecat-ai/mb/twilio-hang-up-handling 2025-07-04 12:44:17 -07:00
Mark Backman
f9e8748a96 TwilioFrameSerializer: Handle user hanging up before the serializer 2025-07-04 09:42:16 -07:00
Filipi da Silva Fuchter
20d6bf267a Merge pull request #2146 from pipecat-ai/remove_gemini_duplicated_code
Removing duplicated code inside Gemini.
2025-07-04 11:59:10 -03:00
Filipi Fuchter
b573f9dab2 Removing duplicated code inside Gemini. 2025-07-04 10:57:53 -03:00
Pete
7ed4fe50d4 Update gemini.py
-FunctionCallFromLLM
-Delete duplicate Gemini imports
2025-07-03 19:39:44 -04:00
Pete
6f66ec1727 Update gemini.py
tab indentation fix
2025-07-03 18:55:21 -04:00
Pete
c7e758fc36 Merge branch 'main' into groundingMetadata 2025-07-03 18:47:47 -04:00
Pete
14c22234bb Fix parameter name consistency in parse_server_event function
- Change function body to use 'str' parameter consistently
- Matches pattern used in OpenAI Realtime Beta service
- Fixes bug where parameter was named 'str' but body used 'message_str'
- Maintains consistency with existing codebase patterns
2025-07-03 18:02:24 -04:00
Pete
d565e9ae53 Update grounding metadata example with final refinements
- Reorganize imports and transport_params structure
- Remove copyright header for consistency
- Enhance grounding metadata logging with better formatting
- Remove unnecessary PipelineParams configuration
- Update message content formatting

Completes incorporation of draft PR #2121 changes
2025-07-03 17:53:55 -04:00
Pete
4951c97eab Clean up verbose logging in grounding metadata implementation
- Remove debug logging from grounding metadata event handlers
- Simplify logging in _process_grounding_metadata method
- Clean up example file logging for better readability
- Remove verbose event parsing comments

Based on suggestions from draft PR #2121
2025-07-03 17:49:27 -04:00
Pete
9b38f3e2fa Delete examples/foundational/26f-gemini-multimodal-live-files-api.py 2025-07-03 17:15:18 -04:00
Mark Backman
dbc76389d8 Merge pull request #2140 from pipecat-ai/mb/fix-26-imports
Fix: missing import in 26f foundational example
2025-07-03 14:12:54 -07:00
Aleix Conchillo Flaqué
c27f838444 Merge pull request #2124 from pipecat-ai/aleix/frame-processor-no-push-queue
FrameProcessor: remove unnecessary push task
2025-07-03 14:03:05 -07:00
Aleix Conchillo Flaqué
ce84485e26 Merge pull request #2142 from pipecat-ai/aleix/publish-workflow-message
github: update publish message to make it clear
2025-07-03 14:02:51 -07:00
Mark Backman
6cf254e2f9 Fix: missing import in 26f foundational example, update twilio transport_params to FastAPIWebsocketParams 2025-07-03 13:58:18 -07:00
Aleix Conchillo Flaqué
02b63c28a5 FrameProcessor: remove unnecessary push task
When we call `FrameProcessor.push_frame()` we end up calling
`FrameProcessor.queue_frame()` on the next or previous processor which already
uses the input queue and guarantees frame ordering. So, there's no need to have
two queues next to each other.
2025-07-03 13:57:32 -07:00
Aleix Conchillo Flaqué
57c6ce7ffa github: update publish message to make it clear 2025-07-03 13:55:02 -07:00
Aleix Conchillo Flaqué
2f3272ea2f Merge pull request #2135 from pipecat-ai/aleix/pipecat-0.0.74
update CHANGELOG for 0.0.74
2025-07-03 13:46:00 -07:00
Aleix Conchillo Flaqué
f5c2d57e4b update CHANGELOG for 0.0.74 2025-07-03 13:44:21 -07:00
Aleix Conchillo Flaqué
baa878272d scripts(evals): added 07a-interruptible-speechmatics.py 2025-07-03 13:44:21 -07:00
Aleix Conchillo Flaqué
093285868e scripts(evals): update timeout back to 90 seconds 2025-07-03 13:37:17 -07:00
Filipi da Silva Fuchter
6c9d058ec2 Merge pull request #2139 from pipecat-ai/filipi/changelog_improvements
Mentioning the SpeechmaticsSTTService in the changelog.
2025-07-03 17:36:55 -03:00
Filipi Fuchter
5df7be6892 Mentioning the SpeechmaticsSTTService in the changelog. 2025-07-03 17:35:30 -03:00
Mark Backman
2deca816ae Merge pull request #2137 from pipecat-ai/mb/fish-audio-normalize
FishAudioTTSService: arg cleanup, add new InputParam and arg
2025-07-03 13:29:14 -07:00
Mark Backman
b8d2fceced Merge pull request #2138 from pipecat-ai/mb/fix-google-llm-import-order
GoogleLLMService: Linting fixes
2025-07-03 13:26:32 -07:00
Sam Sykes
7596d71460 Speechmatics STT + multi-speaker conversations (#2036)
* initial config

* skeleton

* Added a README (to be added to).

* Payloads coming from the ASR.

* doc update

* handle the partials and finals

* enable diarization in the example

* support sending messages to pipecat pipeline

* requirements fix in README

* updated example (with amusement)

* updated example to match master

* updated docs

* support for diarization tags

* logic fix for wrapper

* Use an internal SpeechFrame for speaker_id (not user_id).

* only include speaker tags on finalised transcript (as this may skew end of utterance detection)

* updated docs

* correction to docs and updated example

* updated requirement

* Fix for using default EU server.

* Updates from PR comments.

* Refactor based on comments in the original PR.

Primary focus on documentation, naming conventions and how `user_id` is used.

* Check for SMX installed when importing.

* Variable name change

* Comment correction.

* Support for Esperanto and Uyghur

* Improved language support

* function name change

* Locale fix

* intercept

* interim changes

* pass the pipeline task to the module for adding events to the top of the pipeline

* logging for the pipeline

* Reduce timeout for content aggregator.

* staged update

* testing with Azure

* Updated context (Azure was dropping punctuation) and using better ElevenLabs model.

* Updated to RT 0.3.0 and use OpenAI (not Azure).

* Missing OpenAI import; parameter name change for output locale validation.

* Revert to `0.2.0` of RT SDK.

* fix for assignment of `output_locale_code`.

* update Speechmatics library to 0.3.1

* new transcription example

* updated asyncio task handling

* Updated doc strings

* enable OpenTelemetry logging

* removed import from stt for __init__

* updated examples and default values

* updated examples

* prevent lock up when closing the STT connection
2025-07-03 17:25:13 -03:00
Mark Backman
096067b097 GoogleLLMService: Linting fixes 2025-07-03 13:23:13 -07:00
Mark Backman
ec09505f6b FishAudioTTSService: Add normalize as InputParam, model_id as arg 2025-07-03 13:14:15 -07:00
Mark Backman
251ea756c8 FishTTSService: deprecate model, add reference_id 2025-07-03 12:56:24 -07:00
Aleix Conchillo Flaqué
8f6544efe2 Merge pull request #2133 from pipecat-ai/vp-changelog-fileapi
docs: add changelog line for gemini files api
2025-07-03 11:13:02 -07:00
otaqwawi
6045a8ad8c Add option to change the base URL for Google Generative AI. (#2113)
* Add option to change the base URL for Google Generative AI.
This would be useful to support private instance or gateway of the API

* fix: add proper type hints for http_options in Google LLM service
2025-07-03 11:12:35 -07:00
Aleix Conchillo Flaqué
b184d62634 Merge pull request #2134 from pipecat-ai/aleix/evals-cancel-expired-tasks
cancel expired evals tasks
2025-07-03 10:07:27 -07:00
Aleix Conchillo Flaqué
1a8d512abb scripts(evals): make sure we cancel pending tasks after timeout 2025-07-03 10:01:42 -07:00
vipyne
a62be8ea32 docs: add changelog line for gemini files api 2025-07-03 11:44:34 -05:00
Mark Backman
c230d94ff0 Merge pull request #2125 from pipecat-ai/mb/deprecate-handle-function-call-start
Add docs deprecation for handle_function_call_start
2025-07-03 12:27:17 -04:00
Aleix Conchillo Flaqué
e7b02773f5 Merge pull request #2131 from pipecat-ai/aleix/dtmf-aggregator-dangling-tasks
DtmfAggregator: cancel interruption task to avoid a dangling task
2025-07-03 08:34:50 -07:00
Aleix Conchillo Flaqué
ed83248a6b Merge pull request #2130 from pipecat-ai/aleix/pipeline-task-cancel-queue
PipelineTask: cancel idle queue before cancelling task
2025-07-03 08:32:31 -07:00
Aleix Conchillo Flaqué
af8b4901d4 DtmfAggregator: cancel interruption task to avoid a dangling task 2025-07-03 08:18:48 -07:00
Aleix Conchillo Flaqué
64c8230960 PipelineTask: cancel idle queue before cancelling task 2025-07-03 08:18:21 -07:00
Aleix Conchillo Flaqué
bf664534cc PipelineTask: cancel idle queue before cancelling task 2025-07-03 08:15:31 -07:00
Filipi da Silva Fuchter
274a04e535 Merge pull request #2129 from carolin-tavus/carolin-tavus/add-persona-validation
Add persona validation (check that microphone is enabled)
2025-07-03 11:49:42 -03:00
carolin-tavus
cb81f3d50e format 2025-07-03 14:38:20 +00:00
carolin-tavus
30a3b24287 Add persona validation (check that microphone is enabled) 2025-07-03 14:04:04 +00:00
Filipi da Silva Fuchter
8aacf71956 Merge pull request #1623 from phamtrung0633/victor/azure-tts-interruption-fix
Azure TTS fixed by clearing the audio queue before synthesizing the next text
2025-07-03 10:51:54 -03:00
Victor
72d503d3a3 Azure TTS fixed by clearing the audio queue before synthesizing the next text 2025-07-03 10:48:26 -03:00
Aleix Conchillo Flaqué
453a904290 Merge pull request #2123 from pipecat-ai/aleix/dev-requirements-25-07-02
update dev-requirements (dependabot)
2025-07-02 23:00:40 -07:00
Mark Backman
368bff4fb4 Merge pull request #2101 from pipecat-ai/mb/fix-websocket-example-dir
fix: remove javascript directory from the websocket README
2025-07-02 22:55:47 -04:00
Mark Backman
4ae045d704 Add docs deprecation for handle_function_call_start 2025-07-02 19:53:48 -07:00
Mark Backman
8c71939425 Merge pull request #2122 from pipecat-ai/mb/deprecation-docstrings
Add deprecation directives, add indexing, only autodoc members
2025-07-02 21:31:02 -04:00
Aleix Conchillo Flaqué
a437c2d365 update examples (dependabot) 2025-07-02 16:33:24 -07:00
Aleix Conchillo Flaqué
a1784e3237 update dev-requirements (dependabot) 2025-07-02 16:09:13 -07:00
Mark Backman
abee0f853c Add deprecation directives, add indexing, only autodoc members 2025-07-02 15:44:02 -07:00
Aleix Conchillo Flaqué
e9d358ed17 Merge pull request #2119 from pipecat-ai/aleix/llm-messages-append-update-run-llm
add run_llm to LLMMessagesAppendFrame and LLMMessagesUpdateFrame
2025-07-02 13:53:36 -07:00
Aleix Conchillo Flaqué
c5d54d06bb add run_llm to LLMMessagesAppendFrame and LLMMessagesUpdateFrame 2025-07-02 13:53:13 -07:00
Filipi da Silva Fuchter
c16eed7ca2 Merge pull request #2091 from pipecat-ai/filipi/sample_rate
Creating a new stream resampler which avoids clicks.
2025-07-02 16:22:46 -03:00
Filipi Fuchter
76388a10b5 Deprecating the create_default_resampler and adding the changelog. 2025-07-02 16:20:58 -03:00
Filipi Fuchter
38bcc033a2 Improving the docs about when to use: SOXRAudioResampler x SOXRStreamAudioResampler 2025-07-02 16:20:48 -03:00
Filipi Fuchter
5af563cd91 Configured the services to use create_stream_resampler instead of create_default_resampler 2025-07-02 16:20:34 -03:00
Filipi Fuchter
3de271161c Fixing the ruff script to also try to fix docstrings. 2025-07-02 16:19:57 -03:00
Filipi Fuchter
c19f9bc43a Creating a new stream resampler which avoids clicks. 2025-07-02 16:19:47 -03:00
Mark Backman
ef85d245ed Merge pull request #2120 from haayhappen/patch-1
Update README.md
2025-07-02 15:18:28 -04:00
Fynn Merlevede
25749bd4c0 Update README.md
fix: use correct protocol in README
2025-07-02 20:57:38 +02:00
Mark Backman
e19c5464fe Merge pull request #2114 from pipecat-ai/mb/bump-google-genai-version
Upgrade google-genai version to 1.24.0
2025-07-02 14:25:29 -04:00
Mark Backman
5c2ea3b804 Upgrade google-genai version to 1.24.0 2025-07-02 11:18:37 -07:00
Aleix Conchillo Flaqué
c27348d470 Merge pull request #2118 from pipecat-ai/aleix/daily-python-0.19.4
pyproject: update daily-python to 0.19.4
2025-07-02 10:38:54 -07:00
Aleix Conchillo Flaqué
de5f9c9217 pyproject: update daily-python to 0.19.4 2025-07-02 09:51:36 -07:00
Aleix Conchillo Flaqué
f9086ee3a2 Merge pull request #2110 from pipecat-ai/aleix/daily-add-virtual-speaker-support
DailyTransport: allow receiving audio in a single track
2025-07-02 09:50:02 -07:00
Vanessa Pyne
43298a9026 Merge pull request #2077 from yousifa/mcp-http-gemini-support
Mcp http gemini support
2025-07-02 11:47:25 -05:00
Vanessa Pyne
d80e228c6f Merge branch 'main' into mcp-http-gemini-support 2025-07-02 11:47:18 -05:00
Mark Backman
2902362886 Merge pull request #2115 from pipecat-ai/mb/docstring-cleanup
Docstring cleanup, fix missing examples imports
2025-07-02 11:35:11 -04:00
Mark Backman
1cd303ad7f Merge pull request #2090 from pipecat-ai/mb/silero-np-error
Remove redundant import and global in SileroOnnxModel
2025-07-02 11:28:11 -04:00
Mark Backman
f590a476e7 Gemini Live fixes, plus additional docstrings 2025-07-02 08:27:23 -07:00
Mark Backman
e71cb3ba68 Docstring cleanup, fix missing examples imports 2025-07-02 08:27:23 -07:00
Filipi da Silva Fuchter
510a9af2e5 Merge pull request #2116 from pipecat-ai/filipi/fix_ios_chatbot_demo
Fixed an issue to disconnect the iOS chatbot demo.
2025-07-02 12:13:51 -03:00
Filipi Fuchter
5328f84df4 Fixed an issue to disconnect the iOS chatbot demo. 2025-07-02 12:06:15 -03:00
Yousif Astarabadi
18817fd81b added docstring in public GeminiFileAPI module 2025-07-02 00:09:48 -07:00
Yousif Astarabadi
4bcc536fd2 added arg description in docstring for gemini live init 2025-07-02 00:03:27 -07:00
Yousif Astarabadi
1ab2ddd317 fix lint error 2025-07-01 23:55:34 -07:00
Yousif
09aa168840 Merge branch 'pipecat-ai:main' into mcp-http-gemini-support 2025-07-01 23:54:42 -07:00
Vanessa Pyne
05753fb207 Merge pull request #1786 from getchannel/main
Add File API to GeminiMultimodalLive
2025-07-01 20:29:12 -05:00
Pete
715e3f8543 Merge branch 'pipecat-ai:main' into main 2025-07-01 20:42:28 -04:00
Pete
9c9d4b35a4 remove audio_transcriber from gemini.py
unnecessary import removed.
2025-07-01 20:36:54 -04:00
getchannel
2ee935f784 Update gemini.py 2025-07-01 20:31:58 -04:00
Aleix Conchillo Flaqué
58aedc88a4 DailyTransport: allow receiving audio in a single track 2025-07-01 17:29:10 -07:00
getchannel
0e60385871 add FileAPI to gemini.py 2025-07-01 20:14:31 -04:00
Mark Backman
a4188f7986 Merge pull request #2103 from pipecat-ai/mb/add-user-id-to-transcript
Add user_id to transcription frames
2025-07-01 18:28:12 -04:00
vipyne
c7cbfe7a4f remove grounding metadata commits 2025-07-01 17:21:38 -05:00
vipyne
f1c9f5040b Update examples/foundational/26f-gemini-multimodal-live-files-api.py 2025-07-01 16:27:25 -05:00
vipyne
79e51051c7 New lint rules and remove unused example file 2025-07-01 16:27:25 -05:00
Pete
a63d0da528 Update gemini.py 2025-07-01 16:27:25 -05:00
getchannel
4fd8df208f Add groundingMetadata events.py 2025-07-01 16:27:25 -05:00
getchannel
44d3bd30fa Add groundingMetadata and logging gemini.py 2025-07-01 16:27:25 -05:00
getchannel
6e6e932370 Create 26g-gemini-multimodal-live-groundingMetadata.py 2025-07-01 16:27:25 -05:00
getchannel
baccf50417 update correct upload endpoint file_api.py 2025-07-01 16:27:25 -05:00
getchannel
7b1071b30d Create 26f-gemini-multimodal-live-files-api.py
This is an example to test usage of the Files API integration. Specifically with the Gemini Multimodal Live Service.
2025-07-01 16:27:25 -05:00
getchannel
bd7ca94196 Update gemini.py 2025-07-01 16:27:25 -05:00
getchannel
1ec1aa76e9 Rename file_api to file_api.py
added proper .py to file name.
2025-07-01 16:27:25 -05:00
getchannel
77c369c3c7 add file_api __init__.py 2025-07-01 16:27:25 -05:00
getchannel
9171d4b040 add FileData class events.py 2025-07-01 16:27:25 -05:00
getchannel
e02b95fca5 Create file_api 2025-07-01 16:27:25 -05:00
getchannel
d45a07b5e5 add FileAPI to gemini.py 2025-07-01 16:27:25 -05:00
Mark Backman
0cdcfcee8d Remove redundant import and global in SileroOnnxModel 2025-07-01 13:29:47 -07:00
Mark Backman
324546b4e7 Merge pull request #2098 from StrongMind/aws-session-token
Add support for session token in AWS Nova Sonic service
2025-07-01 16:25:38 -04:00
Filipi da Silva Fuchter
c8ee67a636 Merge pull request #2085 from pipecat-ai/filipi/freeze-test-python-3.10
Fixing pipeline freeze when using Python 3.10
2025-07-01 17:17:38 -03:00
Filipi Fuchter
b87c57c951 Adding missing docstring to the watchdog event 2025-07-01 17:12:18 -03:00
Filipi Fuchter
721f662bbe Making cancel sentinel classes private 2025-07-01 17:09:05 -03:00
Filipi Fuchter
fccd48bfff Fixing pipeline freeze when using Python 3.10 2025-07-01 17:05:18 -03:00
Filipi Fuchter
5310d903ec Adding the requirements and needed variables for the freeze-test example. 2025-07-01 17:04:27 -03:00
Mark Backman
8cbce555e4 Add user_id to stt_traced decorator 2025-07-01 13:01:48 -07:00
Mark Backman
f6112713e8 Add user_id to TranscriptionFrame and InterimTranscriptionFrame pushed by STTServices 2025-07-01 12:59:20 -07:00
Mark Backman
cc637f4dea Clean up docstrings after DirectFunction merge (#2105)
* Add missing import for FunctionCallParams

* Update docstrings in direct_function

* Docstring fixes for run.py

* Remove unused imports in llm_service

* Add missing docstrings to llm_service

* Remove FunctionCallParams import

* Wording improvements

* Type checking for FunctionCallParams
2025-07-01 15:22:30 -04:00
kompfner
7f76a14c54 Merge pull request #2104 from pipecat-ai/pk/changelog-fix
Whoops—fix mistake in CHANGELOG (`FlowsFunctionSchema` -> `FunctionSc…
2025-07-01 15:06:14 -04:00
Yousif Astarabadi
58675f4d5a renamed clean schema to alternate schema 2025-07-01 11:50:12 -07:00
Paul Kompfner
d50e6db312 Whoops—fix mistake in CHANGELOG (FlowsFunctionSchema -> FunctionSchema) 2025-07-01 14:24:27 -04:00
kompfner
de74284a8e Merge pull request #2051 from pipecat-ai/pk/direct-functions
Implement "direct functions", which allow you to bypass specifying a …
2025-07-01 14:19:33 -04:00
Aleix Conchillo Flaqué
4c9a295b28 Merge pull request #2095 from pipecat-ai/aleix/examples-smallwebrtc-sdp-munging
examples: add --esp32 for SDP munging if host name specified
2025-07-01 09:07:42 -07:00
Mark Backman
0968f36d3e fix: remove javascript directory from the websocket README 2025-07-01 09:51:02 -04:00
Mark Backman
fd570b0377 Update the remaining docstrings, update pre-commit hook, add docstring formatting CI, update CONTRIBUTING with formatting guidance (#2089) 2025-07-01 00:37:04 -04:00
Paul Shippy
68ea5ee570 Add to change log 2025-06-30 17:39:42 -07:00
Paul Shippy
f891140a74 Update sample to take in session token 2025-06-30 17:35:50 -07:00
Paul Shippy
5ed2d7ac2b Add session token option for AWS 2025-06-30 17:31:31 -07:00
Pete
a297e4208e Merge branch 'main' into groundingMetadata 2025-06-30 19:48:55 -04:00
Aleix Conchillo Flaqué
b713527da0 examples: add --esp32 for SDP munging if host name specified 2025-06-30 13:27:52 -07:00
Kwindla Hultman Kramer
224d2cedc8 Merge pull request #2088 from pipecat-ai/khk/gemini-thinking-default
Turn off thinking for Gemini models by default
2025-06-30 10:32:54 -07:00
Kwindla Hultman Kramer
55cfea776f Merge branch 'main' into khk/gemini-thinking-default 2025-06-30 10:32:42 -07:00
Paul Kompfner
d7a2078e0b Added CHANGELOG entry describing "direct" functions 2025-06-30 10:59:36 -04:00
Paul Kompfner
a3e540eb32 Rename examples/foundational/14s-function-calling-direct.py to examples/foundational/14t-function-calling-direct.py, since a new "14s" example was added 2025-06-30 10:44:55 -04:00
Paul Kompfner
e01c20be84 Remove unused import and tweak a comment 2025-06-30 10:36:47 -04:00
Paul Kompfner
ce3ca418c2 Unit tests for "direct" functions 2025-06-30 10:36:47 -04:00
Paul Kompfner
15b9a5faf6 Implement "direct functions", which allow you to bypass specifying a function configuration (as a FunctionSchema or in a provider-specific format) and use the Python function directly. Metadata is gathered automatically from the function signature and docstring. 2025-06-30 10:36:42 -04:00
Kwindla Hultman Kramer
3afa30894f Turn off thinking for Gemini models by default 2025-06-28 12:23:35 -07:00
Mark Backman
0ecfa827e6 Improve docstrings for services and processors (#2087) 2025-06-28 13:39:45 -04:00
Aleix Conchillo Flaqué
e1b0db75eb Merge pull request #2086 from pipecat-ai/aleix/watchdog-coroutine-helper
add watchdog coroutine helper
2025-06-27 11:10:10 -07:00
Aleix Conchillo Flaqué
b0c773189f AWSNovaSonicLLMService: fix error with watchdog_coroutine() 2025-06-27 11:09:40 -07:00
Aleix Conchillo Flaqué
3064326834 utils.asyncio: added watchdog_coroutine() 2025-06-27 11:09:40 -07:00
Mark Backman
c67e50fe34 Merge pull request #2084 from pipecat-ai/mb/update-evals-nova-sonic
Add 40-aws-nova-sonic to release evals list
2025-06-27 09:47:59 -04:00
Mark Backman
9d45e3eca1 Merge pull request #2079 from pipecat-ai/mb/fix-42-incorrect-import
fix: example 42 incorrect import
2025-06-27 09:47:47 -04:00
Mark Backman
43a24d15f6 Add 40-aws-nova-sonic to release evals list 2025-06-27 08:34:39 -04:00
Yousif Astarabadi
cafbda1668 remove openai from mcp run http example 2025-06-26 20:21:07 -07:00
Yousif Astarabadi
86c26fd64c moved needs_mcp_clean_schema to LLMService 2025-06-26 20:09:12 -07:00
Yousif Astarabadi
0c20668008 fixed linter errors 2025-06-26 20:08:26 -07:00
Yousif Astarabadi
92df8dc43c fix formatting 2025-06-26 20:08:23 -07:00
Yousif Astarabadi
9d5f5844b8 clean mcp schema for gemini models, update http mcp example to use gemini 2025-06-26 20:07:54 -07:00
Mark Backman
2cf31884d0 fix: example 42 incorrect import 2025-06-26 21:52:14 -04:00
Aleix Conchillo Flaqué
19354c6f2d Merge pull request #2078 from pipecat-ai/aleix/hotfix-0.0.73
just a quick hotfix for 0.0.73
2025-06-26 17:31:40 -07:00
Aleix Conchillo Flaqué
0b2079ad41 update CHANGELOG for 0.0.73 2025-06-26 17:02:12 -07:00
Aleix Conchillo Flaqué
5f18c3af70 OpenAIRealtimeLLMContext: fix circular dependency 2025-06-26 17:01:45 -07:00
Aleix Conchillo Flaqué
0a40285d43 update FrameProcessor.watchdog_timers_enabled references 2025-06-26 16:26:12 -07:00
Vanessa Pyne
5b1c328541 Merge pull request #2075 from pipecat-ai/vp-mcp-lint
mcp_service: lint
2025-06-26 15:25:39 -05:00
vipyne
37929533af mcp_service: lint 2025-06-26 15:00:20 -05:00
Vanessa Pyne
3b92113680 Merge pull request #2030 from yousifa/mcp-streaming-http
MCPClient streamable_http transport support
2025-06-26 14:57:31 -05:00
Yousif
46b52cb9bb Merge branch 'main' into mcp-streaming-http 2025-06-26 12:30:43 -07:00
Mark Backman
f0bcc9d9ba Add MCPClient docstrings. Removed google specific cleanup, changed example to openai 2025-06-26 12:29:45 -07:00
Yousif Astarabadi
1cac028bfe example using http transport for mcp client 2025-06-26 12:16:35 -07:00
Yousif Astarabadi
4956886819 updated error message with StreamableHttpParameters 2025-06-26 12:16:28 -07:00
Yousif Astarabadi
c720cfc7c7 updated streamablehttp to use StreamableHttpParameters type 2025-06-26 12:16:26 -07:00
Yousif Astarabadi
8fcef5628f added streamablehttp support, bumped mcp version, added additional headers and streamable_http params to MCPClient 2025-06-26 12:16:19 -07:00
Aleix Conchillo Flaqué
c4a72802f0 Merge pull request #2074 from pipecat-ai/aleix/pipecat-0.0.72
update CHANGELOG for 0.0.72
2025-06-26 12:10:14 -07:00
Aleix Conchillo Flaqué
917394803c update CHANGELOG for 0.0.72 2025-06-26 11:42:52 -07:00
Mark Backman
01040ddcdd Merge pull request #2071 from pipecat-ai/mb/services-docstrings-update
Add/update docstrings to LLM services
2025-06-26 14:42:32 -04:00
Aleix Conchillo Flaqué
7947497f7e Merge pull request #2073 from a6kme/patch-1
Start HeartBeat when all processors have processed StartFrame
2025-06-26 11:34:46 -07:00
Aleix Conchillo Flaqué
539ca5856f Merge pull request #2072 from pipecat-ai/aleix/utils-watchdog-cleanup
utils(asyncio): simplify watchdog helpers
2025-06-26 11:29:21 -07:00
Abhishek
89c801f82c Start HeartBeat when all processors have processed StartFrame
Some processors, such as STTService and TTSService, don't push StartFrame downstream until they have connected to their service providers. This delays StartFrame for downstream processors.

If a processor receives a HeartBeat frame before StartFrame, we get an AttributeError: `'Processor' object has no attribute '_FrameProcessor__input_queue'`.

The idea is to start HeartBeats only after StartFrame has been processed by all the processors in the pipeline.
2025-06-26 23:28:37 +05:30
Aleix Conchillo Flaqué
3de4f22d34 utils(asyncio): simplify watchdog helpers 2025-06-26 09:40:42 -07:00
Mark Backman
0e4d2be98c Update AzureRealtimeBetaLLMService docstrings 2025-06-26 12:12:00 -04:00
Mark Backman
d8ce108ccd Update OpenAIRealtimeBetaLLMService docstrings 2025-06-26 12:06:47 -04:00
Mark Backman
d123cd4b2b Update GeminiMultimodalLiveLLMService docstrings 2025-06-26 11:47:30 -04:00
Aleix Conchillo Flaqué
4d34aa7cd6 Merge pull request #2069 from pipecat-ai/aleix/utils-asyncio-package
move things to new utils.asyncio package
2025-06-26 08:26:47 -07:00
Aleix Conchillo Flaqué
b860e94582 move things to new utils.asyncio package 2025-06-26 08:24:25 -07:00
Aleix Conchillo Flaqué
9d653e3788 Merge pull request #2068 from pipecat-ai/aleix/task-manager-dont-warn-reset-watchdog
TaskManager: don't warn on reset_watchdog()
2025-06-26 08:23:51 -07:00
Mark Backman
9e518cf2ba Update AWSNovaSonicLLMService docstrings 2025-06-26 11:21:18 -04:00
Mark Backman
2856372ad6 Update TogetherLLMService docstrings 2025-06-26 11:01:35 -04:00
Mark Backman
efbf574613 Update SambaNovaLLMService docstrings 2025-06-26 11:00:40 -04:00
Mark Backman
c018eb2f0e Update QwenLLMService docstrings 2025-06-26 10:57:42 -04:00
Mark Backman
d7bfe54b7c Update PerplexityLLMService docstrings 2025-06-26 10:56:48 -04:00
Mark Backman
137282b7a9 Update OpenRouterLLMService docstrings 2025-06-26 10:53:42 -04:00
Mark Backman
769f8c8f34 Update OpenPipeLLMService docstrings 2025-06-26 10:53:05 -04:00
Mark Backman
8b8a37ae7c Update OLLamaLLMService docstrings 2025-06-26 10:48:19 -04:00
Mark Backman
56e2b006f5 Update NimLLMService docstrings 2025-06-26 10:47:26 -04:00
Mark Backman
79cca05e43 Update GroqLLMService docstrings 2025-06-26 10:46:07 -04:00
Mark Backman
166c8e8e82 Update GrokLLMService docstrings 2025-06-26 10:39:46 -04:00
Mark Backman
9b64d2c325 Update GoogleLLMService docstrings 2025-06-26 10:37:22 -04:00
Mark Backman
03e3e9fae9 Update FireworksLLMService docstrings 2025-06-26 10:28:35 -04:00
Mark Backman
65234ae41a Update DeepSeekLLMService docstrings 2025-06-26 10:27:36 -04:00
Mark Backman
3828df8cf9 Update CerebrasLLMService docstrings 2025-06-26 10:26:42 -04:00
Mark Backman
9cbe85bf99 Update AzureLLMService docstrings 2025-06-26 10:25:17 -04:00
Mark Backman
7bf805b829 Update AWSBedrock docstrings 2025-06-26 10:23:40 -04:00
Mark Backman
990ee436e1 Add Anthropic docstrings 2025-06-26 07:42:22 -04:00
Mark Backman
1cd42066a6 Merge pull request #2067 from pipecat-ai/mb/update-docstrings-for-ref-docs
Update base service class docstrings for better docs auto-generation
2025-06-26 07:07:59 -04:00
Filipi da Silva Fuchter
ba43558049 Merge pull request #2066 from pipecat-ai/filipi/sentry_freeze_test
Enabling watchdog and sentry into the freeze-test
2025-06-26 08:01:51 -03:00
Mark Backman
951c8d34da Add special case handling for STT, TTS, LLM 2025-06-26 00:15:09 -04:00
Mark Backman
ac61139243 Add OpenAI LLM docstrings 2025-06-26 00:06:57 -04:00
Mark Backman
5b8f1fe3e3 Add Cartesia TTS docstrings 2025-06-25 23:50:55 -04:00
Mark Backman
0aa197e4a4 Add docstrings to DeepgramSTTService 2025-06-25 23:36:04 -04:00
Mark Backman
f04e058c96 Programmatically set the copyright date in docs 2025-06-25 23:29:37 -04:00
Mark Backman
6ef2ae12b7 Mock mcp imports 2025-06-25 23:29:37 -04:00
Mark Backman
fe6bbdaefe Skip dataclass attributes to remove duplicate entries 2025-06-25 23:29:37 -04:00
Mark Backman
cc66fddca9 Modify docs auto-gen rules to remove duplicate parameters listing 2025-06-25 23:29:37 -04:00
Mark Backman
04b70ddf13 Add MCPClient docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
bb3bb8d9c6 Improve WebsocketService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
f80f62c7d1 Add VisionService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
2007ae4317 Add ImageGenService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
a1e5a1eff4 Add AIService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
691999b402 Add AIServices docstring 2025-06-25 22:38:11 -04:00
Mark Backman
33f3a4cea1 Add TTSService docstrings 2025-06-25 22:38:11 -04:00
Mark Backman
ab1d2dbe6a Add STTService docstrings 2025-06-25 22:27:07 -04:00
Mark Backman
f622b281d0 Make call_start_function a private function in llm_service 2025-06-25 22:23:13 -04:00
Mark Backman
fb12bf9b4c Update LLMService docstrings 2025-06-25 22:23:13 -04:00
Aleix Conchillo Flaqué
27af50087e TaskManager: don't warn on reset_watchdog() 2025-06-25 17:29:45 -07:00
Filipi Fuchter
03502bed52 Enabling watchdog and sentry into the freeze-test 2025-06-25 20:53:30 -03:00
Aleix Conchillo Flaqué
27c7e2d150 Merge pull request #2063 from pipecat-ai/aleix/watchdog-timers-remove-start-watchdog
no need to call start_watchdog() only reset_watchdog()
2025-06-25 16:47:44 -07:00
Aleix Conchillo Flaqué
e81d387971 TaskManager: rely on add_done_callback() 2025-06-25 16:44:20 -07:00
Aleix Conchillo Flaqué
ef1ade3a71 allow enabling watchdog timers per frame processor or task 2025-06-25 16:36:19 -07:00
Aleix Conchillo Flaqué
4f032f5b96 update keepalive times depending on watchdog timers 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
72cb967780 update CHANGELOG with watchdog timers updates 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
357934a644 watchdog timers are disabled by default use enable_watchdog_timers 2025-06-25 15:55:16 -07:00
Aleix Conchillo Flaqué
327973657f TaskManager: remove watchdog timer when main task is done 2025-06-25 11:26:21 -07:00
Aleix Conchillo Flaqué
d2730e6741 GoogleSTTService: cleanup request queues 2025-06-25 11:12:32 -07:00
Aleix Conchillo Flaqué
eb5ecab104 no need to call start_watchdog() only reset_watchdog() 2025-06-25 11:12:32 -07:00
Mark Backman
202055a9b8 Merge pull request #2065 from pipecat-ai/mb/fix-configdict-openai-realtime
fix: add missing ConfigDict import in openai_realtime_beta/events
2025-06-25 11:40:35 -04:00
Mark Backman
7034a9e3fd fix: add missing ConfigDict import in openai_realtime_beta/events 2025-06-25 11:32:29 -04:00
Pete
1cf0b35ac1 Merge branch 'main' into groundingMetadata 2025-06-24 22:00:16 -04:00
Filipi da Silva Fuchter
8f7ed12262 Merge pull request #2061 from pipecat-ai/not_force_bot_speaking
Not forcing the bot resume speaking in case we receive no transcription.
2025-06-24 20:57:46 -03:00
Aleix Conchillo Flaqué
96b5320ef9 Merge pull request #2055 from pipecat-ai/aleix/fix-sentry-async
SentryMetrics: send metrics to sentry asynchronously
2025-06-24 16:32:01 -07:00
Filipi Fuchter
d5cd742237 Not forcing the bot resume speaking in case we receive no transcription. 2025-06-24 20:12:49 -03:00
Aleix Conchillo Flaqué
1f1da8942d SentryMetrics: send metrics to sentry asynchronously 2025-06-24 15:56:08 -07:00
Mark Backman
7953e1e9d9 Merge pull request #2054 from pipecat-ai/mb/telnyx-catch-hangup-error
fix: Telnyx, catch error when user has hung up the call first
2025-06-24 18:04:19 -04:00
Mark Backman
d6f7ecc0a3 fix: Telnyx, catch error when user has hung up the call first 2025-06-24 17:28:00 -04:00
Mark Backman
3eed316049 Merge pull request #2020 from snova-jorgep/snova-jorgep/sambanova-integration
Add Sambanova LLM and STT integration
2025-06-24 17:04:24 -04:00
Jorge Piedrahita Ortiz
851cf079c3 Merge branch 'main' into snova-jorgep/sambanova-integration 2025-06-24 16:00:28 -05:00
jhpiedrahitao
dfb0da32a9 fmt 2025-06-24 15:59:40 -05:00
Aleix Conchillo Flaqué
f450da57e5 Merge pull request #2056 from pipecat-ai/khk/fix-22d
Update google libraries used in google audio-in examples
2025-06-24 13:47:59 -07:00
Aleix Conchillo Flaqué
2ec6b6c995 Merge pull request #2060 from pipecat-ai/aleix/watchdog-timeout-secs
FrameProcessor: use watchdog_timeout_secs
2025-06-24 13:36:39 -07:00
Aleix Conchillo Flaqué
53b769a8ec FrameProcessor: use watchdog_timeout_secs 2025-06-24 13:33:47 -07:00
Filipi da Silva Fuchter
4f9adc173a Merge pull request #2004 from pipecat-ai/filipi/pipeline_freeze
Pipeline freeze improvements
2025-06-24 17:20:38 -03:00
Filipi Fuchter
dc4a58877e Fixing merge conflict. 2025-06-24 17:12:40 -03:00
Filipi Fuchter
a6243a6fe7 Merge branch 'main' into filipi/pipeline_freeze
# Conflicts:
#	CHANGELOG.md
#	src/pipecat/pipeline/task.py
#	src/pipecat/processors/frame_processor.py
#	src/pipecat/transports/base_input.py
2025-06-24 17:11:21 -03:00
Aleix Conchillo Flaqué
cf5f1b541a Merge pull request #2049 from pipecat-ai/aleix/introduce-watchdog-timers
introduce watchdog timers
2025-06-24 13:00:57 -07:00
Filipi Fuchter
70e6c48233 Mentioning the fixes in the changelog. 2025-06-24 16:56:46 -03:00
Filipi Fuchter
51f7d14d0a Merge branch 'main' into filipi/pipeline_freeze 2025-06-24 16:44:07 -03:00
Filipi Fuchter
4853d5d1fc Handling the case where user stopped speaking but no new aggregation received. 2025-06-24 16:42:10 -03:00
Aleix Conchillo Flaqué
076a8938f0 add start_watchdog/reset_watchdog to tasks 2025-06-24 11:56:20 -07:00
Aleix Conchillo Flaqué
5a3457ba33 introduce task watchdog timers 2025-06-24 11:56:20 -07:00
Aleix Conchillo Flaqué
2fc224384d Merge pull request #2059 from pipecat-ai/aleix/heartbeatframe-control-frames
HeartbeatFrames are now control frames
2025-06-24 11:55:18 -07:00
Aleix Conchillo Flaqué
a4e6ea5a3f HeartbeatFrames are now control frames 2025-06-24 11:27:39 -07:00
Vanessa Pyne
d3c211f293 Merge pull request #2058 from pipecat-ai/vp-mcp-sse-up
follow up to #1887 - proper MCP SSE support
2025-06-24 13:06:01 -05:00
vipyne
20047c369e mcp: update examples to use SseServerParameter 2025-06-24 12:58:39 -05:00
vipyne
dd1ff237a8 lint mcp_service 2025-06-24 12:58:33 -05:00
Vanessa Pyne
39d80d0b0e Merge pull request #1887 from ezun-kim/feat/mcp-sse-params
Fix SSE server connection handling for MCP client
2025-06-24 12:58:05 -05:00
Kwindla Hultman Kramer
7a48316534 update google libraries used in google audio-in examples 2025-06-24 09:52:04 -07:00
Filipi da Silva Fuchter
031a93ac46 Merge pull request #2053 from pipecat-ai/sentry_dsn_environment_variable
Creating an environment variable for sentry dsn.
2025-06-24 12:10:20 -03:00
Mark Backman
ea6cc1aa95 Merge pull request #2052 from pipecat-ai/mb/11labs-keepalive
Send context_id when available in ElevenLabsTTSService keepalive message
2025-06-24 11:07:07 -04:00
Filipi Fuchter
365260ec44 Creating an environment variable for sentry dsn. 2025-06-24 11:57:14 -03:00
Mark Backman
2eb244c80a Send context_id when available in ElevenLabsTTSService keepalive message 2025-06-24 10:52:49 -04:00
Mark Backman
aee3011d61 Merge pull request #2037 from pipecat-ai/mb/11labs-close-context
Fix: Correctly close the context for ElevenLabsTTSService
2025-06-24 07:44:22 -04:00
Aleix Conchillo Flaqué
40496e7b0f Merge pull request #2034 from pipecat-ai/khk/pause-frames
small fix for processor pause/resume frames
2025-06-23 17:08:41 -07:00
Kwindla Hultman Kramer
6b24f89fa7 small fix for processor pause/resume frames 2025-06-23 16:44:32 -07:00
Filipi Fuchter
2097800042 Allowing to clear the turn analyser 2025-06-23 18:50:37 -03:00
Filipi Fuchter
6739318e68 Forcing user stopped speaking due to timeout to receive audio frame! 2025-06-23 18:50:02 -03:00
Filipi Fuchter
d0bd563d42 Logging the BaseException inside the cancel_task. 2025-06-23 18:48:44 -03:00
Filipi Fuchter
74280829fc Fixed an issue with the FastAPIWebsocketClient to disconnect in case the websocket is already closed. 2025-06-23 18:48:03 -03:00
Filipi Fuchter
3fde8880f2 Fixed a couple of places inside the FrameProcessor where we should not raise the exceptions. 2025-06-23 18:47:54 -03:00
Filipi Fuchter
98d39e0d38 Logging the last 10 frames received in case idle timeout is detected. 2025-06-23 18:47:17 -03:00
Filipi Fuchter
c9cebb5ffe Created an example for testing the bot and trying to create freezing conditions. 2025-06-23 18:46:58 -03:00
Mark Backman
f52ac6e99c Merge pull request #1998 from pipecat-ai/mb/fix-38-smart-turn-fal 2025-06-23 17:15:29 -04:00
Mark Backman
787a6b1c6a Merge pull request #2038 from pipecat-ai/mb/openai-realtime-model-update
Update OpenAIRealtimeBetaLLMService model to gpt-4o-realtime-preview-…
2025-06-23 16:30:31 -04:00
Mark Backman
d00a91074e Update OpenAIRealtimeBetaLLMService model to gpt-4o-realtime-preview-2025-06-03 2025-06-23 16:26:42 -04:00
Mark Backman
4e11497a38 Merge pull request #2048 from thibaudbrg/patch-1
Fix missing video_in_enabled in vision bot.py for Moondream template
2025-06-23 16:11:50 -04:00
Tibo
0443d5202a Fix missing video_in_enabled in vision bot.py for Moondream template
The parameter video_in_enabled=True was missing in DailyParams, which prevented image capture 
from working. Without this parameter, UserImageRequestFrame would be sent but no actual image data would be captured from participants.

This fix enables the "Let me take a look" functionality to work as 
intended by allowing the transport to capture video frames for vision processing with Moondream.
2025-06-23 21:17:41 +02:00
Mark Backman
633c25cb13 Merge pull request #2039 from pipecat-ai/mb/remove-lang-validation
OpenAIRealtimeBetaLLMService accepts language for all InputAudioTrans…
2025-06-23 14:41:09 -04:00
jhpiedrahitao
d07f45132f update changelog 2025-06-23 12:54:00 -05:00
jhpiedrahitao
a51280afa6 add 13 and 14 type foundational examples for sambanova integration 2025-06-23 12:53:32 -05:00
Jorge Piedrahita Ortiz
be14eb2460 Merge branch 'pipecat-ai:main' into snova-jorgep/sambanova-integration 2025-06-23 12:23:00 -05:00
jhpiedrahitao
e26dbffcbe update sambanova init imports 2025-06-23 12:22:08 -05:00
Mark Backman
59992fd24a Merge pull request #2044 from pipecat-ai/mb/daily-rest-docstring
Add missing arg docstring in DailyRESTHelper
2025-06-23 11:24:44 -04:00
Mark Backman
455362ccaf Merge pull request #2022 from pipecat-ai/mb/turn-tracking-end-cancel-frame
TurnTrackingObserver ends turn upon seeing EndFrame, CancelFrame
2025-06-23 11:24:27 -04:00
Mark Backman
16c0e2460b TurnTrackingObserver ends turn upon seeing EndFrame, CancelFrame 2025-06-23 11:08:51 -04:00
Matej Marinko
c54084b7a4 Fix deadlock on STT service stop 2025-06-23 14:18:29 +02:00
Mark Backman
92246f7125 Add missing arg docstring in DailyRESTHelper 2025-06-22 13:44:59 -04:00
Pete
e3fe040017 Update gemini.py 2025-06-21 14:43:15 -04:00
Pete
ae5e3e2dc4 Merge branch 'main' into groundingMetadata 2025-06-21 12:16:32 -04:00
Pete
77378d2779 Merge branch 'pipecat-ai:main' into groundingMetadata 2025-06-21 12:08:49 -04:00
Pete
4106f0dabe Merge branch 'pipecat-ai:main' into main 2025-06-21 10:54:25 -04:00
Mark Backman
7737335ec9 OpenAIRealtimeBetaLLMService accepts language for all InputAudioTranscription models 2025-06-21 10:08:46 -04:00
Mark Backman
5cc9b7e0d1 Fix: Correctly close the context for ElevenLabsTTSService 2025-06-20 15:47:03 -04:00
Mark Backman
8c6a441064 Merge pull request #2035 from smokyabdulrahman/feat/aws-polly-lexicon-names-support
Support AWS Polly Lexicon Names parameter
2025-06-20 10:03:27 -04:00
Alrahma
fddc058ce2 add CHANGELOG entry 2025-06-20 14:15:24 +01:00
Alrahma
89750086c5 Support AWS Polly Lexicon Names parameter
Documentation reference
[AWS Managing Lexicons](https://docs.aws.amazon.com/polly/latest/dg/managing-lexicons.html)
2025-06-20 09:47:46 +01:00
Aleix Conchillo Flaqué
e69406c7e2 Merge pull request #2032 from pipecat-ai/aleix/aws-nova-sonic-function-calls
AWSNovaSonicLLMService: fix function calling
2025-06-19 14:42:47 -07:00
Aleix Conchillo Flaqué
878ae42d84 AWSNovaSonicLLMService: fix function calling 2025-06-19 14:26:34 -07:00
Aleix Conchillo Flaqué
d34ebfc126 Merge pull request #2027 from pipecat-ai/aleix/task-on-idle-timeout-repeated
PipelineTask: fix repeated on_idle_timeout
2025-06-19 14:13:10 -07:00
Aleix Conchillo Flaqué
028f7b2d65 PipelineTask: fix repeated on_idle_timeout 2025-06-19 09:14:10 -07:00
Mark Backman
0aa3ec50f2 Merge pull request #2023 from pipecat-ai/mb/allow-interruptions-true
allow_interruptions=True
2025-06-19 10:24:53 -04:00
Mark Backman
9146def21b Update examples to use default allow_interruptions, fixes to align examples 2025-06-19 10:07:32 -04:00
Aleix Conchillo Flaqué
ebb23a5a8c Merge pull request #2024 from pipecat-ai/aleix/audio-buffer-processor-sync-issues
AudioBufferProcessor: treat all streams as intermittent
2025-06-18 18:26:38 -07:00
Aleix Conchillo Flaqué
b118082984 AudioBufferProcessor: treat all streams as intermittent
This fixes an issue with STTMuteFilter that prevents user audio from being
pushed downstream.
2025-06-18 18:23:31 -07:00
Mark Backman
b5c0ac5f25 allow_interruptions=True 2025-06-18 20:33:40 -04:00
Filipi da Silva Fuchter
dc78e874af Merge pull request #2021 from pipecat-ai/gladia_stt_improvements_changelog
Adding the GladiaSTTService improvements in the changelog.
2025-06-18 18:25:36 -03:00
Filipi Fuchter
c30bde0a2b Adding the GladiaSTTService improvements in the changelog. 2025-06-18 16:19:58 -03:00
Filipi da Silva Fuchter
171597fbe9 Merge pull request #1952 from jqueguiner/feat/gladia-auto-reconnect
feat: Enhance GladiaSTTService with reconnection and audio buffer management features
2025-06-18 16:14:58 -03:00
jhpiedrahitao
fae2d272d5 fmt 2025-06-18 10:53:06 -05:00
jhpiedrahitao
03a067d3e6 add sambanova llm and stt 2025-06-18 10:50:42 -05:00
Mark Backman
f5d028f3b3 Merge pull request #2017 from pipecat-ai/mb/fix-11labs-voice-settings
fix: ElevenLabsTTSService voice settings not being sent
2025-06-18 09:56:46 -04:00
Mark Backman
e5b7dbba90 fix: ElevenLabsTTSService voice settings not being sent 2025-06-18 09:49:17 -04:00
Filipi da Silva Fuchter
7ffba1e0b3 Merge pull request #1950 from pipecat-ai/filipi/tavus_custom_tracks
Sending audio to Tavus using custom tracks
2025-06-18 07:57:19 -03:00
Filipi Fuchter
72cdbf0b78 Mentioning the Tavus improvements in the changelog. 2025-06-18 07:46:04 -03:00
Filipi Fuchter
8b4a86f629 Ignoring the audio level when creating the custom tracks. 2025-06-18 07:45:54 -03:00
Filipi Fuchter
fa15e64fc9 Test script that mimics the behavior expected to be supported by Tavus. 2025-06-18 07:45:38 -03:00
Filipi Fuchter
564f064c71 Refactoring TavusVideoService to send audio using WebRTC audio tracks instead of app-messages. 2025-06-18 07:44:51 -03:00
Filipi Fuchter
4062c7afa0 Refactoring TavusTransport to send audio using WebRTC audio tracks instead of app-messages. 2025-06-18 07:44:38 -03:00
Jean-Louis Queguiner
8071c4ba1c Merge branch 'main' into feat/gladia-auto-reconnect 2025-06-18 08:57:21 +02:00
jqueguiner
3d0ffbc832 🐛 (stt.py): handle websocket connection closure gracefully and log warnings
♻️ (stt.py): refactor reconnection logic into a separate method for clarity
 (stt.py): implement exponential backoff for reconnection attempts to improve reliability
2025-06-18 08:52:43 +02:00
Filipi da Silva Fuchter
1cac94bf97 Merge pull request #1925 from pipecat-ai/filipi/websocket_transport_example_twilio
Websocket client web app to test Twilio.
2025-06-17 16:24:18 -03:00
Mark Backman
c94c51d44f Fix: 38-smart-turn-fal 2025-06-17 15:10:52 -04:00
Mark Backman
96958933af Merge pull request #2016 from pipecat-ai/aleix/example-params-allow-async-objects
examples: create transport params async
2025-06-17 15:08:37 -04:00
Filipi Fuchter
2300c2632e Refactoring how we are organizing the twilio chatbot examples and improving the readmes 2025-06-17 16:08:35 -03:00
Filipi Fuchter
cbd0529674 Merge branch 'main' into filipi/websocket_transport_example_twilio 2025-06-17 15:54:31 -03:00
Filipi da Silva Fuchter
5614e35ac4 Merge pull request #2015 from pipecat-ai/bumping_pipecat_required_versions
Bumping pipecat-ai-krisp required version
2025-06-17 15:42:20 -03:00
Aleix Conchillo Flaqué
c11172caba examples: create transport params async 2025-06-17 11:37:42 -07:00
Filipi Fuchter
11b6e409bb Bumping pipecat-ai-krisp required version 2025-06-17 15:22:31 -03:00
Aleix Conchillo Flaqué
3dca95aa3c Merge pull request #2014 from pipecat-ai/aleix/daily-python-0.19.3
update daily-python to 0.19.3
2025-06-17 10:10:23 -07:00
Aleix Conchillo Flaqué
7ddc706434 update daily-python to 0.19.3 2025-06-17 09:30:28 -07:00
Aleix Conchillo Flaqué
20eebb08e9 update CHANGELOG with AWSTranscribeSTTService Polish support 2025-06-16 10:34:56 -07:00
Aleix Conchillo Flaqué
4abf41b85a Merge pull request #2011 from wuodar/wuodar/polish-lang-aws-transcribe
Support polish language in Amazon Transcribe
2025-06-16 10:33:55 -07:00
Aleix Conchillo Flaqué
e426f7ee7c Merge pull request #2012 from pipecat-ai/aleix/frame-pause-resume-frames
FrameProcessor: handle new FrameProcessorPauseFrame/FrameProcessorResumeFrame
2025-06-16 10:32:38 -07:00
Aleix Conchillo Flaqué
14dc6a7984 FrameProcessor: handle new FrameProcessorPauseFrame/FrameProcessorResumeFrame 2025-06-16 10:31:33 -07:00
Mark Backman
e0a24a3f07 Merge pull request #2006 from pipecat-ai/mb/expose-function-calls-in-progress
Expose has_function_calls_in_progress property
2025-06-16 12:49:07 -04:00
Mark Backman
d1bee22d73 Expose has_function_calls_in_progress property 2025-06-16 12:45:16 -04:00
Jon Taylor
d73f7908f2 Merge pull request #2008 from pipecat-ai/khk/groq-audio
fix groq wav file header parsing
2025-06-16 14:09:09 +01:00
Aleix Conchillo Flaqué
a4ea0d2b82 dev-requirements: update pyright 1.1.400 and ruff 0.11.13 2025-06-15 21:05:03 -07:00
Kacper Włodarczyk
e2c15169b8 feat: support polish language in Amazon Transcribe 2025-06-15 21:44:06 +02:00
Kwindla Hultman Kramer
fe16ed3c73 added changelog entry 2025-06-15 10:49:40 -07:00
Filipi Fuchter
80ce097f90 Using relative URL for the websocket. 2025-06-15 10:49:25 -03:00
Filipi Fuchter
eceaf8a46b Making the path to the web client relative 2025-06-14 21:07:15 -03:00
Kwindla Hultman Kramer
1e3fa4a9c7 fix groq wav file header parsing 2025-06-14 17:41:44 -04:00
Pete
2ed1ed6821 Merge branch 'pipecat-ai:main' into main 2025-06-14 16:23:27 -04:00
Filipi da Silva Fuchter
dc640a7591 Merge pull request #2001 from pipecat-ai/filipi/google_stt_reconnection_issue
Fixed an issue with `GoogleSTTService` where it was constantly reconnecting
2025-06-13 08:29:18 -03:00
Filipi Fuchter
1f072d182c Merge branch 'main' into filipi/google_stt_reconnection_issue
# Conflicts:
#	CHANGELOG.md
2025-06-13 08:26:00 -03:00
Mark Backman
1d64e04ed5 Merge pull request #2002 from pipecat-ai/mb/google-fix-ttfb
Fix: GoogleLLMService TTFB
2025-06-12 12:10:01 -04:00
Mark Backman
22f4f0b79e Update 14e example name 2025-06-12 11:45:59 -04:00
Mark Backman
69c63293fb fix: GoogleLLMService TTFB value 2025-06-12 11:43:27 -04:00
Filipi Fuchter
c1db13ceeb Fixed an issue with GoogleSTTService where it was constantly reconnecting before starting to receive audio from the user. 2025-06-12 12:07:33 -03:00
Matej Marinko
6d3a38842d Merge branch 'main' of github.com:pipecat-ai/pipecat 2025-06-12 11:32:38 +02:00
Filipi Fuchter
70eadee0aa Bumping the @pipecat-ai/websocket-transport dependency. 2025-06-11 18:30:16 -03:00
Pete
7360f79413 Merge branch 'pipecat-ai:main' into main 2025-06-11 13:16:19 -04:00
Aleix Conchillo Flaqué
228afe01ed Merge pull request #1993 from pipecat-ai/aleix/pipecat-0.0.71
update CHANGELOG for 0.0.71
2025-06-10 14:42:09 -07:00
Aleix Conchillo Flaqué
61a5154e49 update CHANGELOG for 0.0.71 2025-06-10 14:34:30 -07:00
Sunah Suh
d3df75aaa0 Add additional_span_attributes param to PipelineTask for extra otel… (#1992) 2025-06-10 17:32:24 -04:00
Aleix Conchillo Flaqué
c59180dd6e update CHANGELOG 2025-06-10 14:23:02 -07:00
Mark Backman
e4c2310632 Merge pull request #1990 from pipecat-ai/mb/more-cartesia-stt
Add Cartesia STT docs link to README, fix set_model error
2025-06-10 17:19:11 -04:00
Aleix Conchillo Flaqué
e1735a2da1 Merge pull request #1991 from pipecat-ai/aleix/pipecat-0.0.70
update CHANGELOG for 0.0.70
2025-06-10 14:08:52 -07:00
Aleix Conchillo Flaqué
c101c9c8e1 update CHANGELOG for 0.0.70 2025-06-10 13:37:28 -07:00
Aleix Conchillo Flaqué
96dc162de5 Merge pull request #1988 from pipecat-ai/aleix/update-examples-22b
examples(22b): remove unnecessary parallel pipeline branch
2025-06-10 12:58:37 -07:00
Mark Backman
257dbe3104 Fix model param error 2025-06-10 15:14:47 -04:00
Mark Backman
cd98657e3c Add Cartesia STT docs link to README 2025-06-10 15:09:13 -04:00
Aleix Conchillo Flaqué
03eb22fe0a examples(22b): remove unnecessary parallel pipeline branch 2025-06-10 09:05:58 -07:00
Pete
8d55e13750 remove audio_transcriber from gemini.py
unnecessary import removed.
2025-06-10 11:22:16 -04:00
Pete
737e8e79c9 Merge branch 'main' into groundingMetadata 2025-06-10 11:12:35 -04:00
Pete
4d977fede0 Merge branch 'main' into main 2025-06-10 11:07:59 -04:00
Filipi Fuchter
0073a868d4 Websocket client web app to test Twilio. 2025-06-10 11:34:02 -03:00
Mark Backman
0bb61d72ab Merge pull request #1984 from pipecat-ai/mb/cartesia-stt-cleanup
CartesiaSTTService cleanup
2025-06-10 10:30:18 -04:00
Mark Backman
f758508a82 Merge pull request #1978 from pipecat-ai/mb/rime-languages
Add languages to RimeHttpTTSService, extend lang support to German an…
2025-06-10 10:27:15 -04:00
Mark Backman
69d0218d7e Add languages to RimeHttpTTSService, extend lang support to German and French 2025-06-10 10:20:41 -04:00
Aleix Conchillo Flaqué
eb5e5ab1df update CHANGELOG 2025-06-09 20:22:39 -07:00
Aleix Conchillo Flaqué
093697906c Merge pull request #1954 from WebinarGeek/wg/gladia-informal-translations
Gladia informal translations
2025-06-09 20:21:40 -07:00
Aleix Conchillo Flaqué
efe96b7ed1 Merge pull request #1986 from pipecat-ai/aleix/daily-python0.19.2
pyproject: update daily-python to 0.19.2
2025-06-09 19:46:14 -07:00
Aleix Conchillo Flaqué
7ecdd41ab9 pyproject: update daily-python to 0.19.2 2025-06-09 17:29:07 -07:00
Mark Backman
aec70d61e9 CartesiaSTTService cleanup 2025-06-09 15:20:57 -04:00
Mark Backman
2efac13344 Merge pull request #1983 from pipecat-ai/mb/exotel-resampling
Resample audio in ExotelFrameSerializer
2025-06-09 14:41:08 -04:00
Mark Backman
15aeb11c36 Resample audio in ExotelFrameSerializer 2025-06-09 14:02:25 -04:00
Mark Backman
e705f4d984 Merge pull request #1972 from Vaibhav159/vl_add_exotel_serializer
adding exotel serializer
2025-06-09 13:54:26 -04:00
Shrey Gupta
96fa62fdfe [Add] Support for Cartesia AI STT (#1982) 2025-06-09 14:51:01 -03:00
Mark Backman
845c70797a Merge pull request #1975 from pipecat-ai/mb/11labs-flush-context-reset
fix: ElevenLabsTTSService reset context when flushing audio
2025-06-09 13:21:25 -04:00
kompfner
16048956c3 Merge pull request #1956 from pipecat-ai/pk/make-add-observer-sync
Make `PipelineTask.add_observer()` synchronous. This allows callers t…
2025-06-09 13:19:34 -04:00
Mark Backman
cf2f4b5902 fix: ElevenLabsTTSService reset context when flushing audio 2025-06-09 13:17:55 -04:00
marcus-daily
db46f33f34 Update to Android transports 0.3.7 2025-06-09 17:09:59 +01:00
Aleix Conchillo Flaqué
25d1515daf Merge pull request #1979 from pipecat-ai/aleix/buffer-tts-before-playback
buffer audio from TTS service before pushing frames
2025-06-09 08:43:55 -07:00
Paul Kompfner
a3469cd59f Add CHANGELOG entry describing PipelineTask.add_observer() being made synchronous 2025-06-09 11:37:30 -04:00
Paul Kompfner
513ce26200 Add unit test exercising synchronous usage of PipelineTask.add_observer() right after initializing the PipelineTask (before anything else is done with it) 2025-06-09 11:30:24 -04:00
Paul Kompfner
1cd96f94ff Make PipelineTask.add_observer() synchronous. This allows callers to call it before run()ning the PipelineTask first. Without this change, if they tried to do that, they would get an error because the TaskManager's event loop hadn't been set yet. 2025-06-09 11:30:24 -04:00
Aleix Conchillo Flaqué
901dd041f0 buffer audio from TTS service before pushing frames 2025-06-09 07:29:09 -07:00
Vaibhav159
a2ee94651e removing resampling 2025-06-07 12:53:55 +05:30
Aleix Conchillo Flaqué
abdce063f1 Merge pull request #1973 from pipecat-ai/aleix/assemblyai-yield-none
AssemblyAISTTService: yield None instead of Frame()
2025-06-06 15:12:16 -07:00
Aleix Conchillo Flaqué
a33ce5e4bf AssemblyAISTTService: yield None instead of Frame() 2025-06-06 14:41:01 -07:00
Filipi da Silva Fuchter
c9575eaef9 Merge pull request #1911 from pipecat-ai/filipi/websocket_transport_example
Adding support to WebsocketTransport
2025-06-06 17:25:07 -03:00
Filipi Fuchter
1e74476a71 Refactoring to use the observer inside the pipelinetask, and moving to start the bot inside on_client_ready. 2025-06-06 17:22:50 -03:00
Filipi Fuchter
82935884c4 Mentioning the new websocket example in the changelog. 2025-06-06 17:17:11 -03:00
Filipi Fuchter
d774a23768 Improving the readme to mention that you can choose which websocket server to use. 2025-06-06 17:12:05 -03:00
Filipi Fuchter
e9f041e170 Removing the old websocket-server example 2025-06-06 17:09:01 -03:00
Filipi Fuchter
1f51b6e4f1 A Pipecat example demonstrating how to use WebsocketTransport 2025-06-06 17:08:43 -03:00
Filipi Fuchter
028650249c Adding support in ProtobufFrameSerializer to deserialize MessageFrame. 2025-06-06 17:07:39 -03:00
Vaibhav159
534197239f updating changelog 2025-06-07 00:24:54 +05:30
Vaibhav159
d2f4bb574c adding exotel serializer 2025-06-07 00:22:41 +05:30
jqueguiner
25ff8ef37b (config.py): add new configuration options for lip-sync optimization, context adaptation, and additional context to enhance translation accuracy
♻️ (stt.py): increase default max buffer size from 5MB to 20MB to accommodate larger audio data
♻️ (stt.py): simplify audio sending logic by removing chunking and sending the entire buffered audio at once for improved performance
2025-06-05 16:51:29 -07:00
Aleix Conchillo Flaqué
07fb1a2c39 Merge pull request #1967 from counterleft/unused-http-client-session
Remove instantiation of unused http client session from certain examples
2025-06-05 12:59:01 -07:00
Aleix Conchillo Flaqué
581b800c43 Merge pull request #1961 from ken-kuro/patch-1
fix(piper-tts): typo
2025-06-05 12:57:58 -07:00
Aleix Conchillo Flaqué
30ca39287f Merge pull request #1962 from ken-kuro/patch-2
fix(fastapi_websocket): typo
2025-06-05 12:57:22 -07:00
Aleix Conchillo Flaqué
01fa9698de Merge pull request #1960 from pipecat-ai/aleix/disable-uvloop
disable uvloop by default and just let the user set it
2025-06-05 12:12:47 -07:00
Brian Mathiyakom
10bd969636 Remove instantiation of unused http client session
These examples don't make any HTTP requests with `session` so there
doesn't seem to be a need to create one in the first place. Probably a
copy-paste from a previous example.
2025-06-05 11:45:13 -07:00
Kendrick Ha
f7761f2b61 fix(fastapi_websocket): typo 2025-06-05 13:55:28 +07:00
Kendrick Ha
49ff38a21f fix(piper-tts): typo 2025-06-05 13:50:56 +07:00
Aleix Conchillo Flaqué
8d161306c7 disable uvloop by default and just let the user set it 2025-06-04 21:25:06 -07:00
Vanessa Pyne
027a82dff1 Merge pull request #1958 from pipecat-ai/vp-livekit-fix
fix: transports/services/livekit.py typo
2025-06-04 12:27:47 -05:00
vipyne
cb409d58e0 fix: transports/services/livekit.py typo 2025-06-04 11:14:21 -05:00
Dan Berg
094e2f8151 Fix formatting 2025-06-03 17:21:51 +02:00
Dan Berg
71d121aeb9 Update CHANGELOG.md explaining informal on Gladia TranslationConfig 2025-06-03 17:15:29 +02:00
Dan Berg
b1a88af43c Add informal to Gladia TranslationConfig 2025-06-03 17:10:52 +02:00
Filipi da Silva Fuchter
f73eb4ebd9 Merge pull request #1949 from pipecat-ai/filipi/transport_destination_issue
Fixed transport destination issue
2025-06-03 08:41:34 -03:00
Filipi Fuchter
31ca9be299 Fixing missing await to self.reset. 2025-06-03 08:37:47 -03:00
jqueguiner
02cc6f3d56 Enhance GladiaSTTService with reconnection and audio buffer management features
- Added parameters for maximum reconnection attempts, reconnection delay, and maximum audio buffer size.
- Implemented automatic reconnection logic with exponential backoff.
- Introduced audio buffer management to handle audio data efficiently, including trimming excess data.
- Updated connection handling to ensure proper cleanup and management of WebSocket connections.
- Enhanced audio sending logic to support buffered audio transmission after reconnections.
2025-06-03 03:16:57 -07:00
Filipi Fuchter
1642c082d1 Describing the fix in the changelog. 2025-06-02 22:28:31 -03:00
Filipi Fuchter
892d213442 Fixing issue to keep the transport_destination. 2025-06-02 22:16:10 -03:00
Filipi Fuchter
fc24267e09 Waiting for the LLM response to reset. 2025-06-02 22:15:53 -03:00
Aleix Conchillo Flaqué
9b71bdc608 Merge pull request #1947 from pipecat-ai/aleix/pipecat-0.0.69
update CHANGELOG for 0.0.69
2025-06-02 12:51:51 -07:00
Aleix Conchillo Flaqué
310be89895 update CHANGELOG for 0.0.69 2025-06-02 12:07:50 -07:00
Aleix Conchillo Flaqué
71fbd57e12 Merge pull request #1938 from pipecat-ai/aleix/custom-interruption-strategies
allow custom interruption strategies
2025-06-02 12:05:50 -07:00
Aleix Conchillo Flaqué
ab4b48c823 examples(04a): fix daily_runner import 2025-06-02 12:01:26 -07:00
Aleix Conchillo Flaqué
532767cfa1 LLMUserContextAggregator: reset strategies when resetting the aggregator 2025-06-02 12:01:26 -07:00
Aleix Conchillo Flaqué
5512de3221 allow custom interruption strategies 2025-06-02 12:01:26 -07:00
Mark Backman
13546d5e8f Merge pull request #1946 from pipecat-ai/mb/fix-11labs-context
fix: Use AudioContextWordTTSService context methods in ElevenLabsTTSS…
2025-06-02 14:55:49 -04:00
Mark Backman
c6f1aa8086 fix: Use AudioContextWordTTSService context methods in ElevenLabsTTSService 2025-06-02 14:49:05 -04:00
Mark Backman
5606c47cb7 Merge pull request #1945 from pipecat-ai/mb/gemini-2.0
Reverting Gemini Live model back to gemini-2.0-flash-live-001
2025-06-02 14:25:30 -04:00
Filipi da Silva Fuchter
7f7cd96211 Merge pull request #1944 from pipecat-ai/fixing_tavus_transport
Adding the direction when pushing the frame.
2025-06-02 15:21:58 -03:00
Filipi Fuchter
b828bfd890 Adding the direction when pushing the BotStartedSpeaking and BotStoppedSpeaking frames. 2025-06-02 15:05:56 -03:00
Mark Backman
31d084eb78 Reverting Gemini Live model back to gemini-2.0-flash-live-001 2025-06-02 13:29:05 -04:00
Mark Backman
ab18b280e9 Merge pull request #1943 from pipecat-ai/mb/add-transcription-19-openai
Add a TranscriptProcessor to 19-openai-realtime-beta.py
2025-06-02 13:01:52 -04:00
Mark Backman
24e89c4081 Merge pull request #1936 from pipecat-ai/mb/fix-01-quickstart
Add daily to the foundational examples requirements.txt
2025-06-02 12:55:36 -04:00
Mark Backman
e129390f56 Add a TranscriptProcessor to 19-openai-realtime-beta.py 2025-06-02 11:38:49 -04:00
Mark Backman
4d7c87bb4c Merge pull request #1941 from pipecat-ai/vp-observer-up
chore: move observers arg in p2p-webrtc example; add deprecated to in…
2025-06-02 11:17:48 -04:00
Mark Backman
dac3f82a75 Merge pull request #1934 from counterleft/use-disconnect-on-small-webrtc-connection-example
Fix type checker error with missing function call in the small WebRTC transport example
2025-06-02 11:17:30 -04:00
Mark Backman
fd860921f1 Add daily to the foundational examples requirements.txt 2025-06-02 11:10:49 -04:00
vipyne
0482ccd48b chore: move observers arg in p2p-webrtc example; add deprecated to in line comments 2025-06-02 09:41:09 -05:00
Dominic Stewart
b8b1990617 Fix example env file (#1939)
* Fixed typo in the example env files
2025-06-02 15:12:43 +09:00
Dominic Stewart
70951b1198 Add simplified pstn examples (#1822)
* Add simplified pstn examples
* Add daily_twilio_sip_dial_out example
2025-06-02 14:50:21 +09:00
Mark Backman
6d24514ace Merge pull request #1937 from pipecat-ai/mb/fix-message-for-logging
fix: correctly display non-roman characters
2025-06-01 12:49:48 -04:00
Aleix Conchillo Flaqué
49915ceb84 Merge pull request #1683 from pipecat-ai/aleix/run-function-calls-sequentially
run function calls sequentially or in parallel
2025-06-01 09:47:35 -07:00
Mark Backman
925b13e337 fix: correctly display non-roman characters 2025-06-01 12:29:26 -04:00
Aleix Conchillo Flaqué
ef3143d558 LLMService: don't run function calls if none are given 2025-05-31 14:01:56 -07:00
Mark Backman
ed84637b55 Add additional function call for testing to 14e, 14r, 19, 19a, 26b 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
897a944478 examples(14,14a): add restaurant recommendation function call 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
d86343c38d examples: update to use on_function_calls_started event 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
297afdd126 LLMService: add new FunctionCallsStartedFrame 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
f0cbdc4e68 LLMService: add on_function_calls_started event 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
40b52cadde LLMService: s/FunctionCallLLM/FunctionCallFromLLM/ s/FunctionCallRunner/FunctionCallRunnerItem/ 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
04bf85ddfe LLMService: allow executing tasks sequentially and in parallel 2025-05-31 13:57:52 -07:00
Aleix Conchillo Flaqué
4809684a13 LLMService: cancel function calls on interruptions by default 2025-05-31 13:57:50 -07:00
Aleix Conchillo Flaqué
1eb50ad88f LLMService: pass LLM function calls all at once 2025-05-31 13:57:36 -07:00
Aleix Conchillo Flaqué
52569bcdb2 LLMService: don't allow running functions concurrently for now 2025-05-31 13:57:36 -07:00
Aleix Conchillo Flaqué
a50a407415 LLMService: run function calls sequentially 2025-05-31 13:57:35 -07:00
Aleix Conchillo Flaqué
9f223442c2 Merge pull request #1935 from pipecat-ai/aleix/script-evals
improve evals
2025-05-30 19:14:59 -07:00
Mark Backman
c647114bb9 Merge pull request #1927 from pipecat-ai/mb/gemini-tracing 2025-05-30 22:01:44 -04:00
Mark Backman
43719ec737 Update CHANGELOG 2025-05-30 21:36:39 -04:00
Mark Backman
8602557985 Refactor Gemini tracing to more closely match OpenAI Realtime, add TTFB metrics 2025-05-30 21:36:00 -04:00
Mark Backman
dd1f7d0875 Add tracking to OpenAI Realtime 2025-05-30 21:36:00 -04:00
Mark Backman
ec39e794d3 _handle_transcription 2025-05-30 21:36:00 -04:00
Mark Backman
7b1a937d4c Add tracing for Gemini Live 2025-05-30 21:36:00 -04:00
Mark Backman
0fd38d8115 Merge pull request #1931 from pipecat-ai/mb/num-words
Add support for interruption strategies
2025-05-30 21:14:02 -04:00
Mark Backman
7a4efc6212 Code review feedback 2025-05-30 21:09:15 -04:00
Brian Mathiyakom
2eb2c5a413 Use disconnect() because close() doesn't exist
SmallWebRTCConnection doesn't have a `close()`. There's a `_close()` but I assume that's private due to its naming. The closest function that uses `_close()` is `disconnect()`. I assume then, that the intended resource freeing function call should be to `disconnect()`.
2025-05-30 17:14:53 -07:00
Aleix Conchillo Flaqué
2fcfb0aa9f evals: don't use Deepgram's smart formatting 2025-05-30 16:55:55 -07:00
Aleix Conchillo Flaqué
f1df079512 evals: allow running a single eval 2025-05-30 16:55:55 -07:00
getchannel
8070e156d8 Add groundingMetadata events.py 2025-05-30 18:07:09 -04:00
Aleix Conchillo Flaqué
d77bedbafb evals: move scripts/release to script/evals and add README 2025-05-30 15:04:05 -07:00
getchannel
43c6f1f5cd Add groundingMetadata and logging gemini.py 2025-05-30 18:01:15 -04:00
getchannel
f53f5445ba Create 26g-gemini-multimodal-live-groundingMetadata.py 2025-05-30 17:36:36 -04:00
Mark Backman
b34c593c54 Add changelog entry 2025-05-30 16:48:42 -04:00
Mark Backman
62efbc3342 Add foundational example 42 2025-05-30 16:48:42 -04:00
Mark Backman
2d609a0bde Update LLMUserContextAggregator to conditionally push_aggregation 2025-05-30 16:48:42 -04:00
Mark Backman
6bc4b4a17f Update BaseInputTransport to not push StartInterruptionFrame when InterruptionConfig is set and bot is speaking 2025-05-30 16:48:42 -04:00
Mark Backman
b489e52080 Add InterruptionConfig 2025-05-30 16:15:20 -04:00
getchannel
7263d11ee4 update correct upload endpoint file_api.py 2025-05-30 13:41:55 -04:00
getchannel
f2d5b9ad69 Create 26f-gemini-multimodal-live-files-api.py
This is an example to test usage of the Files API integration. Specifically with the Gemini Multimodal Live Service.
2025-05-30 13:04:52 -04:00
getchannel
40c7e3c52c Update gemini.py 2025-05-30 12:19:40 -04:00
Aleix Conchillo Flaqué
a8aaeec52b Merge pull request #1926 from pipecat-ai/aleix/pause-base-input-transport
handle StopFrame in base input transport and stop pushing frames
2025-05-30 08:27:20 -07:00
Mark Backman
ad7eec181e Merge pull request #1923 from philipp-eisen/philipp/fix-ttfb_ms-implementation
Fix implementation of ttfb_ms metric
2025-05-30 11:03:42 -04:00
Aleix Conchillo Flaqué
b33897ffb9 SmallWebRTCTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
1c3d3f2f4b WebsocketClientTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
9a5a1edb6b WebsocketServerTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
f2eb869b02 FastAPIWebsocketTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
0c7e3cfcb2 LiveKitTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
24e19db29e TavusTransport: don't initialize if a second StartFrame is received 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
bc6d7b7bbd examples(phone-chatbot): don't show dangling tasks in first pipeline 2025-05-30 07:51:51 -07:00
Aleix Conchillo Flaqué
cad271068e BaseInputTransport: handle StopFrame and pause pushing frames 2025-05-30 07:51:49 -07:00
Philipp Eisen
3425293115 Ensure correct formatting 2025-05-30 15:41:45 +02:00
Mark Backman
20dbfec3a9 Merge pull request #1876 from m-ods/m-ods/assemblyai-universal-streaming
Update AssemblyAI Streaming STT
2025-05-30 08:55:43 -04:00
marcus-daily
170057a75a Updating dependency version 2025-05-30 12:38:32 +01:00
marcus-daily
b86b761e0b Fixing Yaml syntax 2025-05-30 12:38:32 +01:00
marcus-daily
da0d2f0266 Small WebRTC transport demo app for Android 2025-05-30 12:38:32 +01:00
Martin Schweiger
321ea27c34 changelog entry 2025-05-30 17:15:58 +08:00
Philipp Eisen
b712e6b9aa Switch ttfb metric to use seconds instead 2025-05-30 11:13:26 +02:00
Martin Schweiger
b3652e6527 set vad_force_turn_endpoint to True by default 2025-05-30 17:07:54 +08:00
Philipp Eisen
bc97f397ef Switch to using float instead of int, but keep ms 2025-05-30 10:27:20 +02:00
Martin Schweiger
e5da3f6e68 add tracing 2025-05-30 10:55:19 +08:00
Martin Schweiger
8400539acf set formatted_finals true by default 2025-05-30 10:40:01 +08:00
Martin Schweiger
b5eac8dfed add message to TranscriptionFrame result 2025-05-30 10:39:07 +08:00
Martin Schweiger
ba312b5591 Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-30 10:29:49 +08:00
Martin Schweiger
f23572b318 Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-30 10:11:02 +08:00
Martin Schweiger
db838634e7 fix: double finals bug 2025-05-30 10:00:31 +08:00
Martin Schweiger
7f2e848a5c use FrameProcessor class methods 2025-05-30 09:40:46 +08:00
Martin Schweiger
096e854d50 remove .dict() 2025-05-30 09:31:20 +08:00
Martin Schweiger
3ffe8b3155 remove parse_obj 2025-05-30 09:29:51 +08:00
Mark Backman
a471f49b61 Merge pull request #1889 from pipecat-ai/mb/add-dtmf-aggregator
Add DTMFAggregator
2025-05-29 17:13:28 -04:00
Mark Backman
4d2a02f318 Refactor to create task on StartFrame, also cleanup 2025-05-29 17:10:54 -04:00
Mark Backman
0bec7db03b Emit a BotInterruptionFrame when the first keypress of a sequence is received 2025-05-29 17:07:18 -04:00
Mark Backman
74827f983f Add tests, improve frame timing 2025-05-29 17:07:18 -04:00
Mark Backman
0ed46f457e Add DTMFAggregator 2025-05-29 17:07:17 -04:00
Aleix Conchillo Flaqué
36b731be73 Merge pull request #1915 from pipecat-ai/aleix/uvloop-event-loop
use uvloop as the new event loop on Linux and macOS
2025-05-29 14:06:44 -07:00
Aleix Conchillo Flaqué
62fbdd4e81 Merge pull request #1922 from pipecat-ai/aleix/output-transport-cleanup
output transports cleanup
2025-05-29 14:06:17 -07:00
Aleix Conchillo Flaqué
ca7b0650c2 examples: capture camera or screen. allow setting framerate 2025-05-29 13:16:44 -07:00
Aleix Conchillo Flaqué
67dd146038 output transports cleanup 2025-05-29 13:16:44 -07:00
Aleix Conchillo Flaqué
fb66df2efd use uvloop as the new event loop on Linux and macOS 2025-05-29 11:24:21 -07:00
Aleix Conchillo Flaqué
2395ca0057 Merge pull request #1921 from pipecat-ai/aleix/transcription-frame-result
add STT result field to TranscriptionFrame/InterimTranscriptionFrame
2025-05-29 11:22:38 -07:00
Aleix Conchillo Flaqué
d203789490 add STT result field to TranscriptionFrame/InterimTranscriptionFrame 2025-05-29 11:21:44 -07:00
Aleix Conchillo Flaqué
7ea0e31cd4 Merge pull request #1924 from pipecat-ai/aleix/new-examples-package
add new pipecat.examples package and make runner public
2025-05-29 11:18:22 -07:00
Mark Backman
d3bf13a503 Merge pull request #1917 from pipecat-ai/mb/fix-twilio-chatbot-client
fix: update websocket_client to use FrameProcessorSetup.task_manager
2025-05-29 14:05:27 -04:00
Mark Backman
ea91970499 Update the twilio-chatbot client to work with the updated server, which requires call_sid 2025-05-29 14:02:49 -04:00
Mark Backman
803b3f2cc4 fix: update websocket_client to use FrameProcessorSetup 2025-05-29 14:02:47 -04:00
Aleix Conchillo Flaqué
1788ba6c5c add new pipecat.examples package and make runner public 2025-05-29 10:43:12 -07:00
Philipp Eisen
5209bd3d9f Fix implementation of ttfb_ms metric 2025-05-29 19:24:04 +02:00
Aleix Conchillo Flaqué
cb9178f1ec Merge pull request #1914 from pipecat-ai/aleix/base-output-daily-handle-dtmf-frames
output transport can now handle DTMF keypress
2025-05-29 09:35:01 -07:00
Aleix Conchillo Flaqué
5676920a6a BaseOutputTransport/DailyTransport: allow sending DTMF keypress 2025-05-29 09:33:29 -07:00
Aleix Conchillo Flaqué
513221d9fd added OutputDTMFUrgentFrame 2025-05-29 09:32:57 -07:00
Mark Backman
a33d0b4b53 Merge pull request #1904 from pipecat-ai/mb/aws-bedrock-no-op-tool
Add no_op tool to AWSBedrockLLMService
2025-05-29 10:29:19 -04:00
Mark Backman
bee242b781 Safer check when using the no_operation tool 2025-05-29 10:25:20 -04:00
Mark Backman
fa1c98ff29 Add no_op tool to AWSBedrockLLMService 2025-05-29 10:25:19 -04:00
Mark Backman
ae3a7d9bed Merge pull request #1920 from alexflorensa/alexflorensa/set-deepgram-model-name
fix(deepgram-stt): set model name to Deepgram STT
2025-05-29 09:41:56 -04:00
Matej Marinko
ee5fea4221 Fix auto finalization cycle 2025-05-29 14:58:35 +02:00
Matej Marinko
db7b60cfe9 Auto finalize fix 2025-05-29 13:24:53 +02:00
alexflorensa
0c2efb312c fix(deepgram-stt) Set model name to Deepgram STT 2025-05-29 12:08:16 +02:00
Matej Marinko
51b79bd6a1 Minor code style changes 2025-05-29 10:11:11 +02:00
Matej Marinko
95fe762776 Fix typo 2025-05-29 09:23:37 +02:00
Aleix Conchillo Flaqué
cf8eeaab0b Merge pull request #1909 from pipecat-ai/aleix/daily-expose-transcription-functions
DailyTransport: expose start_transcription/stop_transcription
2025-05-28 17:06:40 -07:00
Vanessa Pyne
2f8cb3ce76 Merge pull request #1804 from pipecat-ai/vp-text-webrtc-chatbot
add text chatbot example using small webrtc transport
2025-05-28 14:25:42 -05:00
vipyne
821da723c0 update SmallWebRTCTransport text examples with new run_example 2025-05-28 13:54:45 -05:00
Filipi Fuchter
575b97ba60 Some improvements and cleanups in the SmallWebRTCTransport text examples. 2025-05-28 13:54:45 -05:00
vipyne
cc0819b709 examples: add text and audio over webrtc transport
update filenames

add high level comments to 41* examples
2025-05-28 13:54:45 -05:00
vipyne
318d6f042b examples: add text chatbot example using small webrtc transport
examples: send small webrtc UI updates in RTVIServerMessageFrame

add explanation about RTVI server messages being specific to
small-webrtc-prebuilt UI
2025-05-28 13:54:45 -05:00
Aleix Conchillo Flaqué
05ae3a1703 DailyTransport: expose start_transcription/stop_transcription 2025-05-28 11:21:26 -07:00
Aleix Conchillo Flaqué
8e54805e62 Merge pull request #1908 from pipecat-ai/aleix/add-manifest-file-to-reduce-sdist
add MANIFEST.in to reduce sdist tarball size
2025-05-28 10:15:38 -07:00
Aleix Conchillo Flaqué
64399a72f3 add MANIFEST.in to reduce sdist tarball size 2025-05-28 10:09:39 -07:00
Aleix Conchillo Flaqué
6c33f0b0bd Merge pull request #1907 from pipecat-ai/aleix/pipecat-0.0.68
update CHANGELOG for pipecat 0.0.68
2025-05-28 09:41:34 -07:00
Aleix Conchillo Flaqué
aca304b395 update CHANGELOG for pipecat 0.0.68 2025-05-28 09:08:08 -07:00
Aleix Conchillo Flaqué
db5c9e67be Merge pull request #1906 from pipecat-ai/aleix/daily-python-0.19.1
pyproject: update daily-python to 0.19.1
2025-05-28 09:05:13 -07:00
Aleix Conchillo Flaqué
2313cec792 pyproject: update daily-python to 0.19.1 2025-05-28 00:36:40 -07:00
Matej Marinko
2968c846ce Add Soniox STT service 2025-05-28 09:35:21 +02:00
Aleix Conchillo Flaqué
83acaf692a Merge pull request #1890 from pipecat-ai/aleix/examples-multi-transport
add support for multiple transports to foundational examples
2025-05-28 00:27:01 -07:00
Aleix Conchillo Flaqué
e9aeb2662b scripts: allow specifying a name for the test run 2025-05-28 00:22:55 -07:00
Aleix Conchillo Flaqué
356f4039e4 scripts: allow storing logs for release evals 2025-05-27 21:10:22 -07:00
Aleix Conchillo Flaqué
736c7f1f30 scripts: allow storing audio for release evals 2025-05-27 18:09:25 -07:00
Aleix Conchillo Flaqué
2994448036 introduce release evals
This is an initial attempt to implement evals for all (or most) of our
foundational examples. Before we release, we want to make sure all of them work
and reply properly. Until now this has been done manually; hopefully this will
help speed up our release process.
2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
d476d9ea05 examples: remove "on_client_closed"
This has been replaced by "on_client_disconnected" in SmallWebRTCTransport to
match other transports, so it is no longer necessary.
2025-05-27 17:42:52 -07:00
Filipi Fuchter
bf31bce440 Updated SmallWebRTCTransport to align with how other transports handle on_client_disconnected 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
6393e89022 examples(foundational): update handle_sigint depending on transport 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
884268fce3 examples(foundational): allow running examples with twilio 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
071a9307c9 transport(websocket): do not require a frame serializer 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
2cdfaa0a82 examples(foundational): support multiple transports 2025-05-27 17:42:52 -07:00
Aleix Conchillo Flaqué
ecf878e14d DailyTransport: allow requesting video frames with any framerate 2025-05-27 17:42:50 -07:00
Aleix Conchillo Flaqué
4eed335bc7 PipelineTask: check if pipeline has already been cancelled 2025-05-27 17:40:39 -07:00
Aleix Conchillo Flaqué
2e57bb74d2 BaseObject: do not raise exception if event handler not registered 2025-05-27 17:40:35 -07:00
Aleix Conchillo Flaqué
0a39769cd0 Merge pull request #1901 from pipecat-ai/aleix/deepgram-sdk-4.1.0
pyproject: update deepgram-sdk to 4.1.0
2025-05-27 17:20:54 -07:00
Aleix Conchillo Flaqué
bdb6a9e5d1 Merge pull request #1903 from pipecat-ai/aleix/openpipe-4.50.0
pyproject: update openpipe to 4.50.0
2025-05-27 17:20:30 -07:00
Aleix Conchillo Flaqué
f88e0eb96d pyproject: update openpipe to 4.50.0 2025-05-27 15:22:55 -07:00
Aleix Conchillo Flaqué
0099f60d29 deepgram: fix an issue with user provided LiveOptions 2025-05-27 15:19:17 -07:00
Filipi da Silva Fuchter
eaf9f20c56 Merge pull request #1898 from pipecat-ai/tavus_transport_fixes
Fixing TavusTransport with some TTS services.
2025-05-27 16:33:19 -03:00
Aleix Conchillo Flaqué
e987c4741a pyproject: update deepgram-sdk to 4.1.0 2025-05-27 10:32:48 -07:00
Aleix Conchillo Flaqué
6242278abd Merge pull request #1895 from pipecat-ai/aleix/more-avoid-mutable-default-values
more avoiding mutable default constructor values
2025-05-27 10:08:42 -07:00
Mark Backman
30850a431a Merge pull request #1886 from pipecat-ai/mb/add-otel-llm-output
Add LLM response tracing to OTel tracing
2025-05-27 11:58:57 -04:00
Mark Backman
1e7407c042 Merge pull request #1899 from pipecat-ai/mb/genai-to-standard-messages
fix: Update GoogleLLMContext to_standard_messages to be compatible with google-genai
2025-05-27 11:57:44 -04:00
Mark Backman
6d94f31ff2 Update foundational example 33 to work with google-genai 2025-05-27 11:20:21 -04:00
Mark Backman
ebb3d1cfd3 fix: Update GoogleLLMContext to_standard_messages to be compatible with google-genai 2025-05-27 11:20:21 -04:00
Filipi Fuchter
acce9489d7 Creating the silence based on the chunk size. 2025-05-27 11:26:34 -03:00
Filipi Fuchter
3d442620f9 Removing unused imports. 2025-05-27 10:42:06 -03:00
Martin Schweiger
f1d7eb8565 final touches 2025-05-27 21:29:50 +08:00
Filipi Fuchter
798b935ff6 The remaining audio should not be sent as done. 2025-05-27 10:28:40 -03:00
Filipi Fuchter
3039a1444e Refactoring the TavusVideoService to match the behavior of the bot started speaking and bot stopped speaking. 2025-05-27 10:26:41 -03:00
Mark Backman
aa7d15beb3 fix: move LLM call outside tracing try block to prevent double execution 2025-05-26 22:06:31 -04:00
Filipi Fuchter
2b3d2cb342 Fixing the issue when using the TavusVideoService with some TTS services. 2025-05-26 22:55:20 -03:00
Filipi Fuchter
5a58357429 Fixing the issue when using the TavusTransport with some TTS services. 2025-05-26 18:34:52 -03:00
Mark Backman
366add2536 Merge pull request #1878 from pipecat-ai/mb/add-plivo-serializer
Add PlivoFrameSerializer
2025-05-26 11:07:13 -04:00
Mark Backman
e13c9fd42e Add PlivoFrameSerializer 2025-05-26 11:00:01 -04:00
Mark Backman
2a6c01f634 Merge pull request #1885 from pipecat-ai/mb/update-langfuse-example 2025-05-26 10:14:02 -04:00
Mark Backman
bf29722e78 Merge pull request #1884 from pipecat-ai/mb/readme-otel-twilio-telnyx 2025-05-26 10:13:23 -04:00
Mark Backman
db227ad15f Merge pull request #1897 from pipecat-ai/mb/fix-websocket-client-example
Fix mismatched html tag in websocket client example
2025-05-26 09:26:53 -04:00
Mark Backman
514716042b Fix mismatched html tag in websocket client example 2025-05-26 08:25:03 -04:00
Aleix Conchillo Flaqué
7a767e680c more avoiding mutable default constructor values 2025-05-25 21:00:35 -07:00
Martin Schweiger
320b52eb1e Merge branch 'm-ods/assemblyai-universal-streaming' of https://github.com/m-ods/pipecat into m-ods/assemblyai-universal-streaming 2025-05-26 09:13:08 +08:00
Martin Schweiger
428cee75c5 Add User-Agent header to AssemblyAI websocket connection 2025-05-26 09:10:55 +08:00
Martin Schweiger
5479a55b2c Add websockets dependency to assemblyai extra 2025-05-26 09:08:56 +08:00
Mark Backman
d1f2a5d04f Merge pull request #1868 from aristid/google-streaming-tts
Add Google streaming TTS as a base TTS service
2025-05-24 12:42:47 -04:00
aristid
09ba319f3e Merge branch 'main' into google-streaming-tts 2025-05-24 17:16:22 +02:00
ezun-kim
3da711ba8b Fix SSE server connection handling for MCP client
### Summary
This PR improves the MCP (Model Context Protocol) client's SSE (Server-Sent Events) server connection handling by replacing the generic string parameter with a proper `SseServerParameters` class.

### Changes
- **Breaking Change**: Changed `server_params` type from `Union[StdioServerParameters, str]` to `Union[StdioServerParameters, SseServerParameters]`
- Added import for `SseServerParameters` from `mcp.client.session_group`
- Updated SSE client connection to use structured parameters instead of a simple URL string
- Fixed error message to correctly reflect the expected parameter types
- Improved logging by changing info-level log to debug-level for consistency

### Details

#### Before
The SSE client connection only accepted a URL string:
```python
async with self._client(self._server_params) as (read, write):
```

#### After
Now properly unpacks SSE server parameters:
```python
async with self._client(
    url=self._server_params.url,
    headers=self._server_params.headers,
    timeout=self._server_params.timeout,
    sse_read_timeout=self._server_params.sse_read_timeout
) as (read, write):
```

### Benefits
- **Type Safety**: Stronger type checking with dedicated `SseServerParameters` class
- **Extended Configuration**: Support for custom headers (authentication), timeouts, and SSE-specific settings
- **Better Error Messages**: Clear type error messages when incorrect parameters are provided
- **Improved Debugging**: Debug logging of SSE server parameters for troubleshooting

### Migration Guide
Users need to update their SSE server initialization:
```python
# Before
client = MCPClient("https://example.com/sse")

# After
from mcp.client.session_group import SseServerParameters
client = MCPClient(SseServerParameters(
    url="https://example.com/sse",
    headers={"Authorization": "Bearer token"},
    timeout=30,
    sse_read_timeout=60
))
```
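
As a rough illustration of the type change described above, the sketch below dispatches on the parameter type and raises a `TypeError` that names both accepted types. `StdioParams`, `SseParams`, and `describe_connection` are hypothetical stand-ins defined only for this example; they are not the `mcp` or Pipecat classes, the default timeout values are illustrative assumptions, and a real client would open a connection rather than return a dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class StdioParams:
    """Hypothetical stand-in for stdio server parameters."""

    command: str
    args: List[str] = field(default_factory=list)


@dataclass
class SseParams:
    """Hypothetical stand-in for SSE server parameters (defaults are illustrative)."""

    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    timeout: float = 5.0
    sse_read_timeout: float = 300.0


def describe_connection(server_params: Union[StdioParams, SseParams]) -> dict:
    """Return the keyword arguments a transport-specific client would receive."""
    if isinstance(server_params, StdioParams):
        return {
            "transport": "stdio",
            "command": server_params.command,
            "args": list(server_params.args),
        }
    if isinstance(server_params, SseParams):
        # Unpack the structured parameters instead of passing a bare URL string.
        return {
            "transport": "sse",
            "url": server_params.url,
            "headers": dict(server_params.headers),
            "timeout": server_params.timeout,
            "sse_read_timeout": server_params.sse_read_timeout,
        }
    # The error message names both accepted types and the type actually received.
    raise TypeError(
        "server_params must be StdioParams or SseParams, "
        f"got {type(server_params).__name__}"
    )
```

With this shape, passing a plain URL string (the old calling convention) fails immediately with a type error instead of being silently treated as an SSE endpoint.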

### Testing
- [ ] Tested with StdioServerParameters (unchanged behavior)
- [ ] Tested with SseServerParameters with various configurations
- [ ] Verified error handling for invalid parameter types

---

This is a necessary change to support production-ready SSE connections with proper authentication and timeout handling.
2025-05-24 22:35:57 +09:00
Mark Backman
6f524fb816 Add LLM response to OTel tracing 2025-05-24 09:15:39 -04:00
unknown
d3e2a9e5c0 Change default voice and fix formatting 2025-05-24 15:14:39 +02:00
Mark Backman
b4cd7d7941 Langfuse OTel env.example improvements 2025-05-24 08:17:10 -04:00
Mark Backman
cd03b91115 Update README: Add OTel, Add serializers 2025-05-24 07:23:59 -04:00
Aleix Conchillo Flaqué
f86d002ceb Merge pull request #1881 from pipecat-ai/aleix/daily-input-audio-and-video-task-fix
daily input audio and video task fix
2025-05-23 19:39:25 -07:00
Aleix Conchillo Flaqué
940926b5ec TavusVideoService: no need to enable audio/video outputs 2025-05-23 19:29:34 -07:00
Aleix Conchillo Flaqué
85c096df0b DailyTransport: create audio/video input tasks when input flag is enabled 2025-05-23 19:28:18 -07:00
Filipi da Silva Fuchter
76d93522ac Merge pull request #1820 from pipecat-ai/tavus_video_service
Tavus improvements
2025-05-23 23:11:00 -03:00
Filipi Fuchter
31492831cc Updating the changelog and readme to reflect the Tavus changes. 2025-05-23 23:04:04 -03:00
Filipi Fuchter
8221dd594e Creating a Tavus example using the DailyTransport. 2025-05-23 23:03:40 -03:00
Filipi Fuchter
6346ca1a84 Creating a Tavus example using the SmallWebRTCTransport. 2025-05-23 23:03:24 -03:00
Filipi Fuchter
4a3404883f Creating a Tavus example using the new TavusTransport. 2025-05-23 23:03:16 -03:00
Filipi Fuchter
1ebca35313 Queuing the app messages if the SmallWebRTCTransport is not connected yet. 2025-05-23 23:03:04 -03:00
Filipi Fuchter
e0d1381f87 Refactoring the TavusVideoService to allow it to work with any transport. 2025-05-23 23:02:49 -03:00
Filipi Fuchter
86e6841569 Creating TavusTransport and TavusTransportClient. 2025-05-23 23:02:37 -03:00
Aleix Conchillo Flaqué
28b7a92a00 Merge pull request #1880 from pipecat-ai/aleix/daily-resize-event-loop-fix
BaseOutputTransport: don't block event loop during image resize
2025-05-23 18:32:00 -07:00
Aleix Conchillo Flaqué
4db5b18694 BaseOutputTransport: don't block event loop during image resize 2025-05-23 18:30:28 -07:00
Aleix Conchillo Flaqué
a628e921c0 Merge pull request #1879 from pipecat-ai/aleix/daily-fix-video-task
DailyTransport: fix video task variable
2025-05-23 17:56:08 -07:00
Aleix Conchillo Flaqué
6ca6ff37c9 DailyTransport: fix video task variable 2025-05-23 17:54:25 -07:00
Aleix Conchillo Flaqué
456db3710a Merge pull request #1828 from pipecat-ai/aleix/daily-use-audio-renderers
DailyTransport: replace virtual speaker and microphones
2025-05-23 13:31:51 -07:00
Aleix Conchillo Flaqué
50f024c6f9 LiveKitTransport: use UserAudioRawFrame instead of InputAudioRawFrame 2025-05-23 11:27:53 -07:00
Aleix Conchillo Flaqué
a4de75a8c0 Merge pull request #1867 from pipecat-ai/aleix/user-bot-latency-log-observer
observers: added UserBotLatencyLogObserver
2025-05-23 09:23:03 -07:00
Aleix Conchillo Flaqué
88e8fcdaca observers: added UserBotLatencyLogObserver 2025-05-23 09:17:53 -07:00
unknown
bfe9952c9a Remove sleep(0), add doc string etc. 2025-05-23 12:11:08 +02:00
Martin Schweiger
7f568e3e7e Merge branch 'main' into m-ods/assemblyai-universal-streaming 2025-05-23 17:39:00 +08:00
Martin Schweiger
9b8800ac1d update stt.py 2025-05-23 17:32:31 +08:00
Martin Schweiger
fd53712567 add models for new streaming service 2025-05-23 17:32:12 +08:00
Aleix Conchillo Flaqué
7f74c2465c SileroVADAnalyzer: improve non-matching sample rate log 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
30d67a78eb examples(chatbot-audio-recording): use same sample rate to avoid downsampling 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
c3cfd1f0ce DailyTransport: process audio, video and event callbacks in separate tasks 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
69ac70eed8 DailyTransport: replace virtual microphone with custom microphone track 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
fcf49e79cc DailyTransport: use participant audio renderers instead of virtual speaker 2025-05-23 01:47:09 -07:00
Aleix Conchillo Flaqué
8d4894846d pyproject: update to daily-python 0.19.0 2025-05-23 01:47:09 -07:00
Nischal Jain
ffa16dd136 added deepwiki badge for weekly repo refresh 2025-05-22 20:08:48 -07:00
Vanessa Pyne
a809b710c5 Merge pull request #1844 from pipecat-ai/vp-docsinreadme
add docs link at top of readme
2025-05-22 21:52:18 -05:00
vipyne
f6289e9db2 add docs link at top of readme 2025-05-22 21:51:29 -05:00
Mark Backman
26b4c4df22 Merge pull request #1870 from pipecat-ai/mb/gemini-2.5-flash-update
Update GeminiMultimodalLiveLLMService to use Gemini 2.5 Flash Native …
2025-05-22 18:19:55 -04:00
Mark Backman
f3a9844295 Merge pull request #1860 from pipecat-ai/mb/organize-otel-demos
Reorganize OpenTelemetry demos, add top-level README
2025-05-22 18:15:20 -04:00
Mark Backman
692821bdae Merge pull request #1873 from pipecat-ai/mb/readme-sarvam
Add SarvamTTSService to README
2025-05-22 18:14:40 -04:00
Mark Backman
ee143d5b3a Update GeminiMultimodalLiveLLMService to use Gemini 2.5 Flash Native Audio Dialog model 2025-05-22 18:13:41 -04:00
Mark Backman
7e178a634a Merge pull request #1871 from pipecat-ai/mb/claude-sonnet-4-update
Update default model for Anthropic to Claude Sonnet 4
2025-05-22 18:12:47 -04:00
Mark Backman
fe88a3d80b Add SarvamTTSService to README 2025-05-22 18:11:11 -04:00
Mark Backman
a196eac290 Merge pull request #1872 from pipecat-ai/mb/add-sarvam-tts
Add SarvamTTSService
2025-05-22 18:02:36 -04:00
Mark Backman
3c819955a2 Add SarvamTTSService 2025-05-22 16:23:08 -04:00
Mark Backman
ca0d7bbbed Update default model for Anthropic to Claude Sonnet 4 2025-05-22 15:13:33 -04:00
Mark Backman
f93bd1e817 Merge pull request #1864 from pipecat-ai/mb/fix-11lab-set-model-voice
Fix: ElevenLabsTTSService, change voice and model
2025-05-22 14:36:24 -04:00
Mark Backman
415bc6ca0a Fix: ElevenLabsTTSService, change voice and model 2025-05-22 14:28:50 -04:00
Mark Backman
8543c8d11d Merge pull request #1869 from pipecat-ai/mb/update-readme-nova-sonic
Add AWS Nova Sonic to README
2025-05-22 14:07:35 -04:00
Mark Backman
bf5ad64575 Add AWS Nova Sonic to README 2025-05-22 14:03:28 -04:00
unknown
d42d02d809 Add Google streaming TTS as a base TTS service. Rename non-streaming service to GoogleHttpTTSService. 2025-05-22 11:21:06 +02:00
Aleix Conchillo Flaqué
0718f79ff2 Merge pull request #1866 from pipecat-ai/aleix/base-observers-are-base-objects
BaseObserver: inherit from BaseObject so we can have events
2025-05-21 16:07:38 -07:00
Aleix Conchillo Flaqué
9bbce225ce BaseObserver: inherit from BaseObject so we can have events 2025-05-21 16:04:44 -07:00
Mark Backman
fb35fd6d71 Merge pull request #1859 from pipecat-ai/mb/otel-attribute-naming
Update OTel attribute names
2025-05-21 12:10:15 -04:00
Mark Backman
b4fd92aed6 Merge pull request #1862 from marctorsoc/clean-links-in-md-text-filter
Add link cleaning in MD text filter
2025-05-21 09:20:27 -04:00
Mark Backman
36931825b3 Merge pull request #1854 from sklinglernv/fix/elevenlab-tts
fix(elevenlabs tts): message parameter naming
2025-05-21 09:17:29 -04:00
marc.torsoc
ca35299dcd add link cleaning and a test for it 2025-05-21 12:08:53 +02:00
Severin Klingler
e74b900914 revert most of the changes except keyword naming fix 2025-05-21 09:24:03 +02:00
Mark Backman
25115668a7 Reorganize OpenTelemetry demos, add top-level README 2025-05-20 23:30:46 -04:00
Mark Backman
fb94db3e64 Update to use GenAI naming 2025-05-20 22:56:02 -04:00
Mark Backman
c4778e770e Merge pull request #1835 from marcklingen/langfuse-tracing
Add examples/open-telemetry-tracing-langfuse
2025-05-20 18:22:55 -04:00
Mark Backman
3860cdf97b Update OTel attribute names 2025-05-20 18:00:46 -04:00
Aleix Conchillo Flaqué
f3aec0c4ac Merge pull request #1829 from pipecat-ai/aleix/pipeline-task-add-observer
PipelineTask: add add_observer()
2025-05-20 13:18:24 -07:00
Aleix Conchillo Flaqué
d333094149 PipelineTask: add add_observer() and remove_observer() 2025-05-20 13:16:06 -07:00
Aleix Conchillo Flaqué
609ff4e66c Merge pull request #1841 from pipecat-ai/aleix/base-text-aggregator-async
make BaseTextAggregator and BaseTextFilter functions async
2025-05-20 13:13:54 -07:00
Aleix Conchillo Flaqué
cbccbcd9e7 BaseTextFilter: make functions async 2025-05-20 13:11:44 -07:00
Aleix Conchillo Flaqué
54b1d7fcc1 BaseTextAggregator: make functions async 2025-05-20 13:11:42 -07:00
Aleix Conchillo Flaqué
54388c0d9b Merge pull request #1850 from pipecat-ai/aleix/transcription-message-user-id
TranscriptionMessage: add user_id field
2025-05-20 13:10:42 -07:00
Aleix Conchillo Flaqué
228c866aaa Merge pull request #1857 from pipecat-ai/aleix/avoid-mutable-default-values
avoid mutable default constructor values
2025-05-20 13:10:24 -07:00
Aleix Conchillo Flaqué
a09bd648af avoid mutable default constructor values 2025-05-20 11:59:28 -07:00
Vanessa Pyne
3e4ae61c75 Merge pull request #1856 from pipecat-ai/vp-mcp-debug
mcp: fix typo in tool call response
2025-05-20 13:59:11 -05:00
vipyne
7655c432c2 mcp: fix typo in tool call response 2025-05-20 11:16:59 -05:00
Aleix Conchillo Flaqué
25dd651757 TranscriptionMessage: add user_id field 2025-05-19 15:47:54 -07:00
Mark Backman
462aecea3e Merge pull request #1839 from pipecat-ai/mb/cartesia-speed
Add support for Cartesia's speed parameter, update clients and APIs, deprecate emotion
2025-05-19 17:57:25 -04:00
Mark Backman
5f37df790b Merge pull request #1848 from pipecat-ai/mb/fix-word-wrangler-transport-params
Fix: Add audio_in_enabled to Word Wrangler TransportParams
2025-05-19 17:52:05 -04:00
Mark Backman
8e4e03541c Update CHANGELOG 2025-05-19 17:51:27 -04:00
Aleix Conchillo Flaqué
c1252fc7eb Merge pull request #1840 from pipecat-ai/aleix/base-object-dont-create-tasks
BaseObject: don't create event handler tasks if none is registered
2025-05-19 14:12:31 -07:00
Mark Backman
ed1077cc9a Fix: Add audio_in_enabled to Word Wrangler TransportParams 2025-05-19 15:53:29 -04:00
Mark Backman
4c761a7b22 Merge pull request #1847 from pipecat-ai/mb/update-otel
Keep span identifiers in attributes only
2025-05-19 14:37:42 -04:00
Mark Backman
9bc3df7803 Keep span identifiers in attributes only 2025-05-19 12:25:13 -04:00
Aleix Conchillo Flaqué
5e5060a6fe BaseObject: don't create event handler tasks if none is registered 2025-05-19 09:24:56 -07:00
Aleix Conchillo Flaqué
2b66eddaa1 Merge pull request #1830 from pipecat-ai/aleix/pipeline-task-frame-events
PipelineTask: add new started/stopped/ended/cancelled events
2025-05-19 08:32:28 -07:00
Mark Backman
916b9d6c6d Add an example for CartesiaHttpTTSService 2025-05-19 11:31:47 -04:00
Mark Backman
bd09ccd608 Update CartesiaHttpTTSService to work with the new cartesia 2.0 client 2025-05-19 11:31:28 -04:00
Mark Backman
682f8e4d45 Bump the cartesia_version for CartesiaTTSService, and cartesia package for CartesiaHttpTTSService 2025-05-19 11:10:03 -04:00
Mark Backman
c9d0af9ee0 Deprecate emotion, add new speed parameter 2025-05-19 09:43:24 -04:00
Severin Klingler
e1299d59bf fix(elevenlabs tts): Fix message parameter naming and make use of contexts to send out TTSStoppedFrames() 2025-05-19 15:22:13 +02:00
Mark Backman
61da6437ea Merge pull request #1834 from pipecat-ai/mb/gemini-live-tokens
Fix: Make LLMTokenUsage more robust
2025-05-19 09:04:07 -04:00
Marc Klingen
798705469b Update README.md 2025-05-18 21:11:20 +02:00
Marc Klingen
459a753de3 add reference to main otel example 2025-05-18 19:56:12 +02:00
Marc Klingen
1092ce70b3 add video of langfuse trace 2025-05-18 19:46:38 +02:00
Marc Klingen
9511c189bd revert original folder 2025-05-18 19:42:13 +02:00
Marc Klingen
66fea9e2ee create new example folder 2025-05-18 19:41:17 +02:00
Marc Klingen
69ae83516e use http exporter 2025-05-18 19:11:06 +02:00
Mark Backman
144ea36c81 Fix: Make LLMTokenUsage more robust 2025-05-18 07:41:16 -04:00
Mark Backman
7a8ab9a900 Merge pull request #1672 from golbin/main
Use "use_original_timestamps" only for sonic-2 model
2025-05-18 07:24:58 -04:00
Jin Kim
c4b35055b4 Update CHANGELOG.md 2025-05-18 16:54:29 +09:00
Jin Kim
a4c04e7c17 Opt out Sonic models from use_original_timestamps 2025-05-18 16:52:37 +09:00
Jin Kim
a6f7e7fc30 Merge branch 'pipecat-ai:main' into main 2025-05-18 16:48:24 +09:00
Aleix Conchillo Flaqué
d5ebc883b3 PipelineTask: add new started/stopped/ended/cancelled events 2025-05-17 22:46:22 -07:00
Mark Backman
deb43df0a4 Merge pull request #1824 from pipecat-ai/mb/gemini-live-transcribe-user-audio
Update GeminiMultimodalLiveLLMService to use Gemini's user transcription
2025-05-16 22:51:04 -04:00
Mattie Ruth
88e472b3f1 Update Modal Readme (#1825) 2025-05-16 17:40:57 -04:00
Mark Backman
f59fb8167d Merge pull request #1784 from thsunkid/thu/handle-transcript-gpt4o-audio
Handle audio transcript from gpt-4o-audio and clean up logs
2025-05-16 13:20:16 -04:00
Mark Backman
fac6f526f7 Add comments and docstrings 2025-05-16 10:54:50 -04:00
Mark Backman
2f78d74ce6 Change Gemini Live to use Gemini provided usage metrics 2025-05-16 10:53:01 -04:00
Mark Backman
d3942dda52 Gemini Live to transcribe user audio 2025-05-16 10:53:01 -04:00
Mark Backman
c00e9a8d3a Merge pull request #1819 from kaikato/lmnt-model-langs
LmntTTSService: add model param and additional languages
2025-05-16 08:49:55 -04:00
kaikato
c3b95767f3 LmntTTSService: add model param and additional languages 2025-05-16 04:24:57 +00:00
Mark Backman
90f27a3090 Merge pull request #1816 from pipecat-ai/mb/add-minimax-tts
Add MiniMax TTS
2025-05-15 18:05:13 -04:00
Mark Backman
b6f09defc9 Add MiniMax TTS 2025-05-15 18:02:29 -04:00
Aleix Conchillo Flaqué
172813bcfb Merge pull request #1815 from pipecat-ai/aleix/remove-silerovad-processor
remove SileroVAD() frame processor
2025-05-15 13:44:44 -07:00
Aleix Conchillo Flaqué
95c25efab7 remove SileroVAD() frame processor 2025-05-15 11:55:20 -07:00
Aleix Conchillo Flaqué
a51af35024 Merge pull request #1814 from pipecat-ai/aleix/examples-dependabot-05142025
examples: updates for dependabot 05/14/2025
2025-05-15 11:38:45 -07:00
Mark Backman
119fd5ba7d Merge pull request #1025 from fatwang2/main
added hailuo tts service
2025-05-15 14:29:24 -04:00
Aleix Conchillo Flaqué
0718a812bd examples: updates for dependabot 05/14/2025 2025-05-14 22:51:08 -07:00
Mark Backman
3814501b48 Merge pull request #1811 from pipecat-ai/mb/dont-require-tracing-dep
Fix: Resolve an issue where tracing imports were required
2025-05-14 12:35:47 -04:00
Mark Backman
7a5205dbda Fix: Resolve an issue where tracing imports were required 2025-05-14 12:29:08 -04:00
Thu Nguyen
15a5028d23 Revert log changes 2025-05-14 22:28:25 +08:00
Thu Nguyen
fee2648ac0 Handle audio transcript from gpt-4o-audio and clean up logs 2025-05-14 13:02:22 +07:00
Varun Singh
04c02c9a20 Merge pull request #1810 from pipecat-ai/vr000m-receiving-custom-sip-headers
added handling for sipHeaders
2025-05-13 23:02:14 -07:00
getchannel
e27da96cdc Rename file_api to file_api.py
added proper .py to file name.
2025-05-13 22:01:02 -04:00
Varun Singh
0ff7195a83 Update README.md
updating docs
2025-05-13 19:08:43 -04:00
Varun Singh
3b91aa013a added handling for sipHeaders 2025-05-13 16:00:05 -07:00
Mark Backman
50f6235edb Add support for OpenTelemetry tracing (#1729)
* Also added TurnTrackingObserver, TurnTraceObserver, foundational 29, open-telemetry-example
2025-05-13 17:18:11 -04:00
Aleix Conchillo Flaqué
6f4d94f91b Merge pull request #1800 from pipecat-ai/aleix/frame-processors-setup
introduce frame processors setup
2025-05-13 13:18:06 -07:00
Aleix Conchillo Flaqué
83a4c7d443 RTVIProcessor: remove unused code 2025-05-13 11:26:37 -07:00
Aleix Conchillo Flaqué
8171fec925 SmallWebRTCConnection: complain if av package not found 2025-05-13 11:26:37 -07:00
Aleix Conchillo Flaqué
175f352ea7 add FrameProcessor.setup() to setup processors before StartFrame 2025-05-13 11:26:35 -07:00
Filipi da Silva Fuchter
5290161ac4 Merge pull request #1746 from pipecat-ai/simple_chatbot-react-native
Simple chatbot: React Native client
2025-05-13 10:48:09 -03:00
Filipi Fuchter
8762019ed7 Not setting the local audio level when the user stopped speaking. 2025-05-13 10:46:30 -03:00
Filipi Fuchter
61a59fa158 Fixing useNavigation typescript warning. 2025-05-13 10:36:39 -03:00
Filipi Fuchter
55eea20c8e Renaming expo environment variable 2025-05-13 10:32:27 -03:00
kompfner
9a621f0c54 Merge pull request #1805 from pipecat-ai/pk/aws-nova-sonic-aggregate-user-transcription-text
AWS Nova Sonic service - aggregate user transcription text; it was fr…
2025-05-13 09:13:58 -04:00
Paul Kompfner
55fc24e933 AWS Nova Sonic service - aggregate user transcription text; it was fragmented across many conversation history messages before 2025-05-13 09:13:28 -04:00
Filipi da Silva Fuchter
b14608f09b Merge pull request #1799 from pipecat-ai/daily_audio_source
Using audio source for capturing Daily's participant audio
2025-05-13 08:15:10 -03:00
Mark Backman
4a25c57337 Merge pull request #1806 from pipecat-ai/aleix/run-test-observers
tests: allow passing observers to run_test()
2025-05-12 22:10:44 -04:00
Aleix Conchillo Flaqué
f800e35ccb tests: allow passing observers to run_test() 2025-05-12 17:53:02 -07:00
Vanessa Pyne
12d49a9b9d Merge pull request #1801 from pipecat-ai/vp-fix-typo
update examples
2025-05-12 15:33:56 -05:00
vipyne
b25b251a44 update examples 2025-05-12 14:07:17 -05:00
Mattie Ruth
64b2a75a94 Update Modal App: (#1755)
* Update Modal App:

Updated Modal App to include:

1. Latest Modal API usage
2. Ability to launch different Pipecat pipelines, much like the
   simple chatbot example
3. Ability to choose which pipeline is launched via the
   /connect endpoint
4. Added a pipeline option for connecting to a self-hosted LLM
   on Modal
5. Improved READMEs
6. Added a web client for interacting with the Modal deployment

tmp

* Update README
2025-05-12 12:45:43 -05:00
Aleix Conchillo Flaqué
b33a60f3a5 Merge pull request #1793 from pipecat-ai/khk/deepgram-async-fix
Fix Deepgram TTS streaming
2025-05-12 09:59:46 -07:00
Filipi Fuchter
d22dbb1a6d Fixing ruff format. 2025-05-12 10:36:21 -03:00
Filipi Fuchter
983199a6cd New example capturing the audio from the participant using the custom audio source. 2025-05-12 10:18:43 -03:00
Filipi Fuchter
133d7ee33a Fixing the default audio source for capture_participant_audio 2025-05-12 10:16:32 -03:00
Mark Backman
0bd888afc7 Merge pull request #1796 from nikp06/patch-1
Wrong deprecation warning when importing ai_services.py
2025-05-12 09:12:48 -04:00
nikp06
537bd1c58d Update ai_services.py
fix: correct deprecation warning format in ai_services module
2025-05-12 12:01:13 +02:00
Kwindla Hultman Kramer
5ef519fe2c Fix Deepgram TTS to use stream_raw() 2025-05-11 15:40:31 -07:00
Mark Backman
20498fb47f Merge pull request #1790 from AngeloGiacco/angelo/fix-api-key
[elevenlabs tts ] fix api key
2025-05-10 19:16:27 -04:00
Angelo Giacco
b57dfb3b5d fix lint 2025-05-10 16:36:26 +01:00
Angelo Giacco
0355ed4aa1 move api key to ws header 2025-05-10 16:34:01 +01:00
Angelo Giacco
1e76cc7bdc fix: elevenlabs api key 2025-05-10 16:09:20 +01:00
Vanessa Pyne
18c0374126 Merge pull request #1785 from pipecat-ai/vp-small-filenmae-change
39-aws-nova-sonic.py -> 40-aws-nova-sonic.py
2025-05-09 12:19:09 -05:00
Aleix Conchillo Flaqué
7072fba7e7 Merge pull request #1780 from pipecat-ai/aleix/deprecate-google-generativeai
GoogleLLMService: deprecate google-generativeai
2025-05-09 09:18:30 -07:00
Aleix Conchillo Flaqué
3d702a5c39 minor examples cleanup 2025-05-09 09:16:10 -07:00
Aleix Conchillo Flaqué
f31efa42c9 GoogleLLMService: deprecate google-generativeai 2025-05-09 09:14:43 -07:00
getchannel
d86502e79a add file_api __init__.py 2025-05-09 10:53:31 -04:00
getchannel
59c7744590 add FileData class events.py 2025-05-09 10:52:04 -04:00
getchannel
949971dea9 Create file_api 2025-05-09 10:51:24 -04:00
getchannel
cd4a893c65 add FileAPI to gemini.py 2025-05-09 10:50:27 -04:00
vipyne
74b369ff20 39-aws-nova-sonic.py -> 40-aws-nova-sonic.py 2025-05-09 08:30:59 -05:00
Filipi Fuchter
46eed0a59a Bumping to use the latest version of @pipecat-ai/react-native-daily-transport, and removing code not needed. 2025-05-08 18:18:00 -03:00
kompfner
9643296e29 Merge pull request #1779 from pipecat-ai/pk/aws-nova-sonic-missing-params-export
Add missing `Params` export to AWS Nova Sonic module
2025-05-08 16:04:38 -04:00
Paul Kompfner
c83c5b5a34 Add missing Params export to AWS Nova Sonic module 2025-05-08 15:23:25 -04:00
Filipi Fuchter
277e2d7fc0 Merge branch 'main' into simple_chatbot-react-native 2025-05-08 09:03:16 -03:00
Mark Backman
7280e390d9 Merge pull request #1774 from pipecat-ai/mb/moondream-ex-server
Add load_dotenv to moondream example server
2025-05-07 19:02:30 -04:00
Mark Backman
4efc3f0a39 Merge pull request #1775 from pipecat-ai/mb/patient-ex-env
Add load_dotenv to patient-intake server file
2025-05-07 19:02:20 -04:00
Mark Backman
cb7e7a8aa3 Add load_dotenv to patient-intake server file 2025-05-07 18:40:04 -04:00
Mark Backman
9136402846 Add load_dotenv to moondream example server 2025-05-07 18:29:27 -04:00
Aleix Conchillo Flaqué
260fc76137 Merge pull request #1773 from pipecat-ai/aleix/pipecat-0.0.67
update CHANGELOG for 0.0.67
2025-05-07 15:05:55 -07:00
Aleix Conchillo Flaqué
7cfb9a4d15 update CHANGELOG for 0.0.67 2025-05-07 14:59:16 -07:00
Mark Backman
2089e0c974 Merge pull request #1768 from pipecat-ai/mb/update-observers
Add DebugLogObserver
2025-05-07 17:30:49 -04:00
Mark Backman
9e0b4fe5d1 Replace list with tuple 2025-05-07 17:21:09 -04:00
Mark Backman
75ce632f84 Add DebugLogObserver 2025-05-07 17:21:08 -04:00
Mark Backman
efeb96c4e8 Remove unused imports 2025-05-07 17:20:42 -04:00
kompfner
fb5438e9c2 Merge pull request #1770 from pipecat-ai/pk/amazon-nova-sonic-interruption-reliability
AWS Nova Sonic service - make interruption handling more reliable, in…
2025-05-07 17:16:06 -04:00
Mark Backman
7da9f66e1c Merge pull request #1761 from pipecat-ai/mb/elevenlabs-context-id
Update ElevenLabsTTSService to use the new websocket API
2025-05-07 17:12:06 -04:00
Mark Backman
9e16e3d614 Update ElevenLabsTTSService to use the new websocket API 2025-05-07 17:09:52 -04:00
Paul Kompfner
84d040c6d0 AWS Nova Sonic service - make interruption handling more reliable, in terms of:
- not getting the conversation into a "stuck" state
- not losing assistant text that should've made it into the context
2025-05-07 16:34:18 -04:00
Mark Backman
f3e0beb8f1 Merge pull request #1762 from pipecat-ai/iss-1734-rtvi-function-call-breakage
Revert breaking change in RTVI protocol for function calling
2025-05-07 15:25:22 -04:00
Aleix Conchillo Flaqué
e00a1196ef Merge pull request #1767 from pipecat-ai/aleix/daily-python-0.18.2
pyproject: update daily-python to 0.18.2
2025-05-07 12:19:59 -07:00
Aleix Conchillo Flaqué
3867c0f8e7 Merge pull request #1766 from pipecat-ai/aleix/daily-fix-multiple-audio-video-sources
fix multiple audio video sources
2025-05-07 12:19:46 -07:00
Aleix Conchillo Flaqué
cdf0953722 pyproject: update daily-python to 0.18.2 2025-05-07 11:56:36 -07:00
Aleix Conchillo Flaqué
ed00f7d071 add video_source field to UserImageRequestFrame 2025-05-07 11:50:21 -07:00
Aleix Conchillo Flaqué
a3038afa02 DailyTransport: fix multiple audio/video sources 2025-05-07 11:50:00 -07:00
kompfner
f9ca0b8cc6 Merge pull request #1704 from pipecat-ai/pk/amazon-nova-sonic
Amazon Nova Sonic LLM service
2025-05-07 14:45:28 -04:00
Paul Kompfner
2920aa5af4 [WIP] AWS Nova Sonic service - pull AWS Nova Sonic support out of the aws optional dependency in pyproject.toml and into its own aws-nova-sonic optional dependency. That's because it requires Python >= 3.12, a higher version than the base project's 3.10. This change allows anyone using any of the other AWS services (including our own unit tests) to continue using the lower Python version. 2025-05-07 14:32:32 -04:00
Paul Kompfner
93c9cc4a0e [WIP] AWS Nova Sonic service - minor fix 2025-05-07 13:54:06 -04:00
Paul Kompfner
b53f9235e4 [WIP] AWS Nova Sonic service - remove unnecessary _context_available state, instead just relying on the presence of _context 2025-05-07 13:54:06 -04:00
Paul Kompfner
1491462d15 [WIP] AWS Nova Sonic service - remove _handling_bot_stopped_speaking, which no longer seems to be necessary; I'm no longer observing back-to-back BotStoppedSpeaking frames 2025-05-07 13:54:06 -04:00
Paul Kompfner
c78f779800 [WIP] AWS Nova Sonic service - log an error message if you try to use AWS Nova Sonic without the proper dependency (e.g. without having done pip install pipecat-ai[aws]) 2025-05-07 13:54:06 -04:00
Paul Kompfner
b013e375fb [WIP] AWS Nova Sonic service - simplify a bit of logic (and do the same simplification in the OpenAI Realtime service) 2025-05-07 13:54:06 -04:00
Paul Kompfner
52036138c1 [WIP] AWS Nova Sonic service - remove unnecessary (no-op) code 2025-05-07 13:54:06 -04:00
Paul Kompfner
4ba9a42861 [WIP] AWS Nova Sonic service - add more accurate typing 2025-05-07 13:54:06 -04:00
Paul Kompfner
27bff7a759 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
896f8d85f7 [WIP] AWS Nova Sonic service - remove out-of-date TODO comment 2025-05-07 13:54:06 -04:00
Paul Kompfner
ed06cdd2c7 [WIP] AWS Nova Sonic service - add CHANGELOG entry 2025-05-07 13:54:02 -04:00
Paul Kompfner
8473647269 [WIP] AWS Nova Sonic service - update persistent-context example to better avoid saving "transitional", as opposed to meaningful, context messages 2025-05-07 13:52:51 -04:00
Paul Kompfner
5579145a06 [WIP] AWS Nova Sonic service - post-rebase, update examples to play nicely with recent pipecat changes 2025-05-07 13:52:51 -04:00
Paul Kompfner
35848d10b3 [WIP] AWS Nova Sonic service - remove various TODO comments 2025-05-07 13:52:51 -04:00
Paul Kompfner
c7e223e85a [WIP] AWS Nova Sonic service - remove print statements in favor of logger 2025-05-07 13:52:51 -04:00
Paul Kompfner
885b2d1d2f [WIP] AWS Nova Sonic service - make parameters configurable 2025-05-07 13:52:51 -04:00
Paul Kompfner
73020be511 [WIP] AWS Nova Sonic service - minor fix: only try to read received JSON if we have it 2025-05-07 13:52:51 -04:00
Paul Kompfner
d388c057c0 [WIP] AWS Nova Sonic service - recover from unwanted disconnection due to an error 2025-05-07 13:52:51 -04:00
Paul Kompfner
c4d0f91a7f [WIP] AWS Nova Sonic service - remove some old code that was accidentally still there, possibly sending a duplicate system instruction 2025-05-07 13:52:51 -04:00
Paul Kompfner
467233be04 [WIP] AWS Nova Sonic service - support multi-line system prompt 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b02d08f4c [WIP] AWS Nova Sonic service - add comments to examples pointing out the us-east-1 is the only supported region so far 2025-05-07 13:52:51 -04:00
Paul Kompfner
9fe265ea64 [WIP] AWS Nova Sonic service - implement ability to persist and load conversations 2025-05-07 13:52:51 -04:00
Paul Kompfner
cc1f4ba81c [WIP] AWS Nova Sonic service - add a hacky way of programmatically triggering an assistant response 2025-05-07 13:52:51 -04:00
Paul Kompfner
3784bdbd27 [WIP] AWS Nova Sonic service - in our hacky direct manipulation of the context, aggregate assistant text rather than recording every chunk as a separate message 2025-05-07 13:52:51 -04:00
Paul Kompfner
4ffdc3b77c [WIP] AWS Nova Sonic service - do hacky direct manipulation of the context for now, since I can't seem to get assistant context aggregation working properly with frames, grr 2025-05-07 13:52:51 -04:00
Paul Kompfner
38c9fa681a [WIP] AWS Nova Sonic service - Protect against back-to-back BotStoppedSpeaking calls, which I've observed 2025-05-07 13:52:51 -04:00
Paul Kompfner
c477039954 [WIP] AWS Nova Sonic service - just for safety, add a short delay after BotStoppedSpeaking before sending LLMFullResponseEndFrame + TTSStoppedFrame, to give a bit of leeway for the LLM to deliver the "FINAL" text block describing what was said 2025-05-07 13:52:51 -04:00
Paul Kompfner
d6ef3d64ac [WIP] AWS Nova Sonic service - fix context problems of double-counting LLM text, and mis-categorizing user text as LLM text 2025-05-07 13:52:51 -04:00
Paul Kompfner
6938152db6 [WIP] AWS Nova Sonic service - fix comment 2025-05-07 13:52:51 -04:00
Paul Kompfner
2154db07f0 [WIP] AWS Nova Sonic service - remove unnecessary error log 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e0803479e [WIP] AWS Nova Sonic service - add send_transcription_frames option 2025-05-07 13:52:51 -04:00
Paul Kompfner
3960c604a4 [WIP] AWS Nova Sonic service - fix empty assistant conversation history item in the context after tool use 2025-05-07 13:52:51 -04:00
Paul Kompfner
394648f1c9 [WIP] AWS Nova Sonic service - fix user utterances not making it into the context 2025-05-07 13:52:51 -04:00
Paul Kompfner
da5c4953d5 [WIP] AWS Nova Sonic service - allow passing in tools into initializer 2025-05-07 13:52:51 -04:00
Paul Kompfner
2b7e1cb5b1 [WIP] AWS Nova Sonic service - add tool calling 2025-05-07 13:52:51 -04:00
Paul Kompfner
f182eafb40 [WIP] AWS Nova Sonic service - add ability to pass in OpenAILLMContext 2025-05-07 13:52:51 -04:00
Paul Kompfner
9f7f42e885 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
9b8bce1914 [WIP] AWS Nova Sonic service - add voice_id 2025-05-07 13:52:51 -04:00
Paul Kompfner
96d05e12fc [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
68c1069548 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5b64613f65 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
1f9baefba8 [WIP] AWS Nova Sonic service - added stubs for handling interruption and user-started-speaking frames 2025-05-07 13:52:51 -04:00
Paul Kompfner
0c255d2618 [WIP] AWS Nova Sonic service - added TTSTextFrame and reworked/cleaned up some bookkeeping logic 2025-05-07 13:52:51 -04:00
Paul Kompfner
a38206de9c [WIP] AWS Nova Sonic service - added TranscriptionFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
260f7c9b85 [WIP] AWS Nova Sonic service - format 2025-05-07 13:52:51 -04:00
Paul Kompfner
de294caed9 [WIP] AWS Nova Sonic service - added LLMFullResponseStartFrame, LLMTextFrame, and LLMFullResponseEndFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
e40aa4f99a [WIP] AWS Nova Sonic service - added TTSStartedFrame and TTSStoppedFrame 2025-05-07 13:52:51 -04:00
Paul Kompfner
b1d413b9be [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
8cbad070ad [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
13569a5a5a [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
d789334a60 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
7668b27fc0 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
6d30f441e8 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
a9e395b366 [WIP] AWS Nova Sonic service 2025-05-07 13:52:51 -04:00
Paul Kompfner
5e5626f04f [WIP] AWS Nova Sonic service 2025-05-07 13:52:47 -04:00
Aleix Conchillo Flaqué
d80aa5b44e Merge pull request #1753 from pipecat-ai/aleix/add-bedrock-support
Add support for Amazon Bedrock LLMs
2025-05-07 09:31:48 -07:00
Aleix Conchillo Flaqué
80ef6dc4de update README with AWS Bedrock and Transcribe 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
458549f7df AWSBedrockLLMService: fix function calling 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a8405649d0 aws: use AWS prefix for all services 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
ce1a72850b tests: add bedrock context aggregator tests 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
58de381746 AWS: add missing utils 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
bed2e894a2 BedrockLLMService: pull initial system frame from messages 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
b4de98cfb7 AWS: various cleanups (logs, imports...) 2025-05-07 09:26:26 -07:00
Aleix Conchillo Flaqué
a4b9db9e07 fix formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
664111a3c9 Added cache related info to metrics 2025-05-07 09:26:26 -07:00
Adithya Suresh
aa964847f3 System param to be a list 2025-05-07 09:26:26 -07:00
Adithya Suresh
fa5cac7e0a Bug fix in content format 2025-05-07 09:26:26 -07:00
Adithya Suresh
b2b01861b2 Remove model restriction 2025-05-07 09:26:26 -07:00
Adithya Suresh
f014f718eb Restructured STT and enabled prosody tags for generative Polly 2025-05-07 09:26:26 -07:00
Adithya Suresh
05ae8d3ffa Removed OpenAI based context formatting 2025-05-07 09:26:26 -07:00
Adithya Suresh
88c9e08bd8 Updated tools parsing logic 2025-05-07 09:26:26 -07:00
Adithya Suresh
844f61dfea Initial implementation 2025-05-07 09:26:26 -07:00
Tico Ballagas
acb7d597cb Change example to use generative voices 2025-05-07 09:19:49 -07:00
Tico Ballagas
2b18f60261 Initial implementation of AWS Transcribe TTS 2025-05-07 09:19:49 -07:00
mattie ruth backman
5b66133a6c Revert breaking change in RTVI protocol for function calling 2025-05-07 12:08:28 -04:00
Mark Backman
0c5bc6a57a Merge pull request #1760 from WebinarGeek/wg/daily-active-speaker-event
DailyTransport: added on_active_speaker_changed event handler
2025-05-07 11:17:49 -04:00
Mark Backman
7981e00955 Merge pull request #1759 from pipecat-ai/mb/readme-nvidia-riva
Update README with Riva services
2025-05-07 11:13:51 -04:00
Dan Berg
5e39c0cfeb DailyTransport: added on_active_speaker_changed event handler 2025-05-07 15:22:30 +02:00
Mark Backman
a444701929 Update README with Riva services 2025-05-07 09:02:08 -04:00
Mark Backman
f6c1eb5d9d Merge pull request #1757 from pipecat-ai/mb/remove-canonical
Removing CanonicalMetricsService
2025-05-06 21:38:03 -04:00
Mark Backman
a1d46cb26b Removing CanonicalMetricsService 2025-05-06 21:23:23 -04:00
Aleix Conchillo Flaqué
99ab148d88 Merge pull request #1739 from pipecat-ai/aleix/observers-frame-pushed-class
BaseObserver: add FramePushed class and deprecate multiple arguments
2025-05-06 15:29:05 -07:00
Aleix Conchillo Flaqué
d69fa5dba5 update CHANGELOG with UltravoxSTTService fix 2025-05-06 15:26:25 -07:00
Aleix Conchillo Flaqué
0d30b000af BaseObserver: add FramePushed class and deprecated multiple arguments 2025-05-06 15:26:23 -07:00
Mark Backman
e7c0e742d2 Merge pull request #1752 from pipecat-ai/mb/deepgram-tts-aura-2
Update Deepgram TTS default voice to Aura 2 voice
2025-05-06 16:26:26 -04:00
Mark Backman
2aff2dcca3 Merge pull request #1751 from pipecat-ai/mb/11labs-enable_ssml_parsing
Add enable_ssml_parsing and enable_logging to ElevenLabsTTSService
2025-05-06 16:25:20 -04:00
Mark Backman
288f8865c8 Add enable_logging to ElevenLabsTTSService 2025-05-06 12:13:26 -04:00
Mark Backman
8691870bcb Update Deepgram TTS default voice to Aura 2 voice 2025-05-06 11:29:32 -04:00
Mark Backman
e06146c237 Add enable_ssml_parsing to ElevenLabsTTSService 2025-05-06 11:06:57 -04:00
Aleix Conchillo Flaqué
c68e990cda Merge pull request #1748 from pipecat-ai/aleix/task-manager-dictionary
task manager dictionary and cleanup PipelineTask
2025-05-06 07:57:53 -07:00
Aleix Conchillo Flaqué
4583905313 PipelineTask: cleanup if task is cancelled from outside Pipecat 2025-05-05 21:33:21 -07:00
Aleix Conchillo Flaqué
9cc498b1fa TaskManager: use a dictionary instead of a set to store tasks 2025-05-05 21:27:49 -07:00
Mark Backman
b3c5dc4045 Merge pull request #1443 from adithyaxx/anthropic-client-bug-fixes
Handle missing token counts in AsyncAnthropicBedrock client properly
2025-05-05 21:13:11 -04:00
Filipi Fuchter
56ca7360ae Fixing versions 2025-05-05 19:11:59 -03:00
Filipi Fuchter
d5ab3251f0 Bumping the dependencies, updating readme, adding .gitignore. 2025-05-05 18:43:04 -03:00
Filipi Fuchter
915c284420 Fixing readme 2025-05-05 18:32:04 -03:00
Aleix Conchillo Flaqué
3824da7261 Merge pull request #1745 from pipecat-ai/aleix/make-sure-transports-are-ready
only send data to transports after they are really ready
2025-05-05 14:21:36 -07:00
Filipi Fuchter
40154824e8 Creating a RN example for simple-chatbot 2025-05-05 18:17:39 -03:00
Aleix Conchillo Flaqué
855d567b1e only send data to transports after they are really ready 2025-05-05 14:06:58 -07:00
Mark Backman
b323a7bd88 Merge pull request #1742 from pipecat-ai/mb/pcc-krisp-filter
Update pipecat-cloud-example to use Krisp in PCC deployment only
2025-05-05 15:46:12 -04:00
Mark Backman
fa011d0018 Update pipecat-cloud-example to use Krisp in PCC deployment only 2025-05-05 15:09:29 -04:00
Aleix Conchillo Flaqué
e15fa8777a Merge pull request #1737 from CerebriumAI/kyle/fix-ultravox-spacing
[Fix] Ultravox frame spacing issue
2025-05-05 09:34:49 -07:00
Aleix Conchillo Flaqué
2143a6d927 Merge pull request #1732 from pipecat-ai/aleix/daily-remote-custom-tracks
DailyTransport: remove custom tracks before leaving
2025-05-05 08:44:11 -07:00
Aleix Conchillo Flaqué
044e2d3e73 DailyTransport: remove custom tracks before leaving 2025-05-05 08:35:35 -07:00
Kyle Gani
be112ec63f Merge branch 'kyle/fix-ultravox-performance' of github.com:CerebriumAI/pipecat into kyle/fix-ultravox-performance 2025-05-05 17:13:26 +02:00
Kyle Gani
d2f56c4e8f Fix: Spacing issue 2025-05-05 17:13:21 +02:00
Mark Backman
ddc6a9c695 Merge pull request #1670 from pipecat-ai/mb/daily-twilio-sip-example
Add standalone Daily + Twilio SIP example
2025-05-05 10:57:16 -04:00
Mark Backman
2bebdbc371 Merge pull request #1671 from pipecat-ai/khk/rime-arcana
support for rime arcana model
2025-05-05 10:54:50 -04:00
Mark Backman
8b9f1f0608 Add a changelog entry 2025-05-05 10:51:46 -04:00
Kwindla Hultman Kramer
b25f3b2ed2 support for rime arcana model 2025-05-05 10:50:46 -04:00
Mark Backman
a995cf81b6 Merge pull request #1724 from pipecat-ai/mb/demo-fixes
Demo fixes
2025-05-05 08:44:57 -04:00
Aleix Conchillo Flaqué
75d261639f Merge pull request #1726 from pipecat-ai/aleix/pipecat-0.0.66
update CHANGELOG for pipecat 0.0.66
2025-05-02 20:54:57 -07:00
Aleix Conchillo Flaqué
f720d795d0 update CHANGELOG for pipecat 0.0.66 2025-05-02 20:29:51 -07:00
Aleix Conchillo Flaqué
f6fe83e358 Merge pull request #1725 from pipecat-ai/aleix/update-daily-python-0.18.1
update to daily-python 0.18.1
2025-05-02 20:27:50 -07:00
Mark Backman
0513d0b6a8 Update README 2025-05-02 22:44:50 -04:00
Mark Backman
0679bb217d Remove Twilio from phone-chatbot directory 2025-05-02 22:18:50 -04:00
Mark Backman
38bd55e518 Update README 2025-05-02 22:18:50 -04:00
Mark Backman
65c7423280 Add other dial-in event handlers 2025-05-02 22:18:50 -04:00
Mark Backman
f24a85cc94 Add logic to only forward the first on_dialin_ready event 2025-05-02 22:18:50 -04:00
Mark Backman
53887b7c98 Display phone number in WebRTC call 2025-05-02 22:18:50 -04:00
Mark Backman
523c012c38 Use a Twilio asset to ring the phone throughout 2025-05-02 22:18:50 -04:00
Mark Backman
97c28989c1 Add standalone Daily + Twilio SIP example 2025-05-02 22:18:50 -04:00
Mark Backman
c19be6ebb2 Demo fixes 2025-05-02 20:58:10 -04:00
Aleix Conchillo Flaqué
54971a0735 update to daily-python 0.18.1 2025-05-02 17:47:44 -07:00
Mark Backman
4513e81e13 Merge pull request #1723 from pipecat-ai/mb/base-output-bot-speaking-log
Only display the destination in the bot started/stopped speaking log …
2025-05-02 17:32:47 -04:00
Mark Backman
872204b795 Only display the destination in the bot started/stopped speaking log when there is a destination 2025-05-02 17:29:28 -04:00
Aleix Conchillo Flaqué
a94cbfe6f5 Merge pull request #1722 from pipecat-ai/aleix/base-output-transport-audio-task-fix
BaseOutputTransport: always initialize audio task
2025-05-02 14:26:30 -07:00
Aleix Conchillo Flaqué
7152faafb2 BaseOutputTransport: always initialize audio task
We also use the audio task to also send synchronized images with audio.
2025-05-02 14:23:15 -07:00
Mark Backman
e6aadaccd8 Merge pull request #1721 from pipecat-ai/mb/simli-silent-frames
Fix: SimliVideoService was continuously emitting audio, preventing Bo…
2025-05-02 16:44:39 -04:00
Mark Backman
3a73aa71b8 Merge pull request #1613 from pipecat-ai/mb/improve-storybot-readme
demo: Restructure storytelling-chatbot directory, update README steps…
2025-05-02 16:39:59 -04:00
Mark Backman
814e7509e1 demo: Restructure storytelling-chatbot directory, update README steps, link to vercel demo 2025-05-02 16:37:37 -04:00
Vanessa Pyne
e0cf5ec016 Merge pull request #1705 from pipecat-ai/vp-update-nvidia-models
Riva Service: add magpie-tts-multilingual model
2025-05-02 15:34:23 -05:00
vipyne
667bd32e6a Riva: remove deprecated lines in example 2025-05-02 15:33:10 -05:00
vipyne
b2ecd83706 update CHANGELOG with Riva details 2025-05-02 15:33:10 -05:00
vipyne
b2754117c8 Riva: refactor function_id and model_name 2025-05-02 15:33:10 -05:00
vipyne
6c428c303b update magpie voice 2025-05-02 15:33:10 -05:00
Mark Backman
e7d889a143 Update RivaSTTService to use by default 2025-05-02 15:33:10 -05:00
Mark Backman
da60e7069b Update pyproject.toml to use nvidia-riva-client 2.19.1 2025-05-02 15:33:10 -05:00
Mark Backman
c14406a3b9 Demos use the latest services 2025-05-02 15:33:10 -05:00
Mark Backman
725ab5ec21 Small fixes: No default api_key of None, ParakeetSTTService uses RivaSTTService.InputParams 2025-05-02 15:33:10 -05:00
Mark Backman
daf9d47e58 Update RivaSegmentedSTTService 2025-05-02 15:33:10 -05:00
vipyne
63a65627a2 Riva Service: add magpie-tts-multilingual model 2025-05-02 15:33:10 -05:00
Mark Backman
02c07755b0 Add Changelog entry for PR 1707 2025-05-02 15:33:10 -05:00
Matt Kim
15cbd18acc [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-05-02 15:33:10 -05:00
Kwindla Hultman Kramer
93c40b87dc small groq updates 2025-05-02 15:33:10 -05:00
Mark Backman
eeaa9f67a1 Fix: SimliVideoService was continuously emitting audio, preventing BotStoppedSpeakingFrame from being sent 2025-05-02 16:32:42 -04:00
Mark Backman
b60691c7b2 Merge pull request #1720 from pipecat-ai/mb/changelog-pr-1707
Add Changelog entry for PR 1707
2025-05-02 16:13:40 -04:00
Mark Backman
2bb1b0b343 Add Changelog entry for PR 1707 2025-05-02 16:09:50 -04:00
Mark Backman
047ef9f86c Merge pull request #1707 from rimelabs/matt/rime/url_param_serialization
[Rime] Add new params to RimeTTSService
2025-05-02 16:08:01 -04:00
Kwindla Hultman Kramer
9a2c603c91 Merge pull request #1711 from pipecat-ai/khk/groq-updates 2025-05-02 12:21:15 -07:00
Filipi da Silva Fuchter
94c4169407 Merge pull request #1717 from pipecat-ai/local_smart_turn_torch
Local smart turn torch
2025-05-02 15:53:30 -03:00
Filipi Fuchter
cb8a551db8 Mentioning the new LocalSmartTurnAnalyzer in the changelog. 2025-05-02 14:32:18 -03:00
Filipi Fuchter
779f09af70 Fixing lint. 2025-05-02 14:22:38 -03:00
Filipi Fuchter
19dc0f2bfb New example using the local smart turn 2025-05-02 14:21:42 -03:00
Filipi Fuchter
f0709e22ba Creating a local smart turn using torch. 2025-05-02 14:21:29 -03:00
Mark Backman
8250736f5e Merge pull request #1708 from pipecat-ai/mb/gemini-user-context
Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remo…
2025-05-02 13:10:27 -04:00
Mark Backman
83348a9f93 Merge pull request #1714 from pipecat-ai/mb/fix-gemini-text-modality
Restore TEXT modalities support to GeminiMultimodalLiveLLMService
2025-05-02 10:41:05 -04:00
Mark Backman
96d40903a9 Only send TTSStoppedFrame from Gemini when in AUDIO mode, only send one LLMFullResponseEndFrame 2025-05-02 10:18:53 -04:00
Aleix Conchillo Flaqué
2560811805 Merge pull request #1697 from pipecat-ai/aleix/daily-custom-audio-tracks
add support for multiple transport destinations
2025-05-02 06:34:09 -07:00
Mark Backman
2b8c44c008 Merge pull request #1710 from pipecat-ai/mb/openai-context-aggregation
fix: OpenAIRealtimeBetaLLMService writes two assistant messages to th…
2025-05-02 07:43:35 -04:00
Mark Backman
38e2d37674 Restore TEXT modalities support to GeminiMultimodalLiveLLMService 2025-05-02 07:36:12 -04:00
Vanessa Pyne
6278561f88 Merge pull request #1709 from pipecat-ai/vp-fix-fastpitch-params-update
Riva TTS: update FastPitch params
2025-05-01 21:23:10 -05:00
Aleix Conchillo Flaqué
750e79c1ce DailyParams: rename to camera/microphone_out_enabled 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
71eb2963c5 examples: added daily-custom-tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
f44e2c86ea BaseOutputTransport: compute sample_rate and audio_chunk_size in main class 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
afe1f0df8c DailyTransport: make sure we can write audio frames to destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
458fddfb48 update CHANGELOG with new Daily and Transport features 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
8d915c5ccb DailyParams: allow enabling/disabling camera/microphone tracks 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
304153dd03 TTSService: set transport destination to all TTS frames 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
a6781b7352 rename destination to transport_destination 2025-05-01 19:17:14 -07:00
Aleix Conchillo Flaqué
5ad0058303 update CHANGELOG with frame source/destination support 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
75c039de33 examples: add daily-multi-translation 2025-05-01 19:11:13 -07:00
Aleix Conchillo Flaqué
74e3c3677e DailyTransport: fix audio/video renderers registration 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
dc20327f10 DailyTransport: register audio destination and use custom tracks 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
e738affd29 BaseOutputTransport: allow sending audio/video to multiple destinations 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
ef3d732607 DailyTransport: allow capturing multiple simultaneous audio/video sources 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
6d63cff1bf DailyTransport: custom audio tracks support 2025-05-01 18:58:44 -07:00
Aleix Conchillo Flaqué
12f42605a1 pyproject: update daily-python to 0.18.0 2025-05-01 18:58:44 -07:00
Kwindla Hultman Kramer
fac3337927 small groq updates 2025-05-01 17:09:15 -07:00
Mark Backman
76d198151c Push GeminiMultimodalLiveLLMService TranscriptionFrame Upstream, remove direct context addition 2025-05-01 15:41:04 -04:00
Mark Backman
6a907058de fix: OpenAIRealtimeBetaLLMService writes two assistant messages to the context 2025-05-01 15:37:39 -04:00
vipyne
6e1f531f64 Riva TTS: update FastPitch params
91138c3f66 (diff-ece228577b1d233ce600a948243f90cece53e3a9b89554a0b27a48bc4d6e0fdfR45)
2025-05-01 11:14:41 -05:00
Matt Kim
4232cca5b6 [Rime] Add phonemizeBetweenBrackets and pauseBetweenBrackets to RimeTTSService (ws)
There is a fix incoming in
2025-04-30 18:09:22 -07:00
Jin Kim
cf2f249f8a Use "use_original_timestamps" only for sonic-2 model 2025-04-27 19:33:14 +09:00
Prem Adithya
c510870736 Merge branch 'pipecat-ai:main' into anthropic-client-bug-fixes 2025-04-04 16:41:04 +11:00
Adithya Suresh
e8783f6a33 Handle cache token counts being none 2025-03-31 15:25:11 +11:00
fatwang2
8cda4512ad Merge branch 'pipecat-ai:main' into main 2025-02-06 10:50:25 +08:00
fatwang2
fc90bdc638 changed to HailuoHttpTTSService 2025-01-19 09:43:48 +08:00
fatwang2
5a88165a26 Merge branch 'pipecat-ai:main' into main 2025-01-19 09:40:08 +08:00
fatwang2
3466842cd4 add hailuo tts service 2025-01-17 12:46:05 +08:00
1231 changed files with 98929 additions and 97270 deletions

@@ -1,48 +0,0 @@
-name: android
-on:
-  push:
-    branches:
-      - main
-    paths:
-      - "examples/simple-chatbot/client/android/**"
-  pull_request:
-    branches:
-      - "**"
-    paths:
-      - "examples/simple-chatbot/client/android/**"
-  workflow_dispatch:
-    inputs:
-      sdk_git_ref:
-        type: string
-        description: "Which git ref of the app to build"
-concurrency:
-  group: build-android-${{ github.event.pull_request.number || github.ref }}
-  cancel-in-progress: true
-jobs:
-  sdk:
-    name: "Simple chatbot demo"
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout repo
-        uses: actions/checkout@v4
-        with:
-          ref: ${{ github.event.inputs.sdk_git_ref || github.ref }}
-      - name: "Install Java"
-        uses: actions/setup-java@v4
-        with:
-          distribution: 'temurin'
-          java-version: '17'
-      - name: Build demo app
-        working-directory: examples/simple-chatbot/client/android
-        run: ./gradlew :simple-chatbot-client:assembleDebug
-      - name: Upload demo APK
-        uses: actions/upload-artifact@v4
-        with:
-          name: Simple Chatbot Android Client
-          path: examples/simple-chatbot/client/android/simple-chatbot-client/build/outputs/apk/debug/simple-chatbot-client-debug.apk

@@ -21,24 +21,20 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
-      - name: Set up Python
-        id: setup_python
-        uses: actions/setup-python@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v3
         with:
-          python-version: '3.10'
-      - name: Setup virtual environment
-        run: |
-          python -m venv .venv
-      - name: Install basic Python dependencies
-        run: |
-          source .venv/bin/activate
-          python -m pip install --upgrade pip
-          pip install -r dev-requirements.txt
+          version: "latest"
+      - name: Set up Python
+        run: uv python install 3.10
+      - name: Install development dependencies
+        run: uv sync --group dev
       - name: Build project
-        run: |
-          source .venv/bin/activate
-          python -m build
-      - name: Install project and other Python dependencies
-        run: |
-          source .venv/bin/activate
-          pip install --editable .
+        run: uv build
+      - name: Install project in editable mode
+        run: uv pip install --editable .

@@ -18,35 +18,28 @@ jobs:
     steps:
       - name: Checkout repo
         uses: actions/checkout@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "latest"
       - name: Set up Python
-        id: setup_python
-        uses: actions/setup-python@v4
-        with:
-          python-version: "3.10"
-      - name: Cache virtual environment
-        uses: actions/cache@v3
-        with:
-          # We are hashing dev-requirements.txt and test-requirements.txt which
-          # contain all dependencies needed to run the tests.
-          key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
-          path: .venv
+        run: uv python install 3.12
       - name: Install system packages
         id: install_system_packages
         run: |
           sudo apt-get install -y portaudio19-dev
-      - name: Setup virtual environment
+      - name: Install dependencies
         run: |
-          python -m venv .venv
-      - name: Install basic Python dependencies
-        run: |
-          source .venv/bin/activate
-          python -m pip install --upgrade pip
-          pip install -r dev-requirements.txt -r test-requirements.txt
+          uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
       - name: Run tests with coverage
         run: |
-          source .venv/bin/activate
-          coverage run
-          coverage xml
+          uv run coverage run
+          uv run coverage xml
       - name: Upload coverage to Codecov
         uses: codecov/codecov-action@v5
         with:

View File

@@ -17,30 +17,27 @@ concurrency:
jobs:
ruff-format:
name: "Formatting checker"
name: "Code quality checks"
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: "3.10"
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install development Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: "latest"
- name: Set up Python
run: uv python install 3.10
- name: Install development dependencies
run: uv sync --group dev
- name: Ruff formatter
id: ruff-format
run: |
source .venv/bin/activate
ruff format --diff
- name: Ruff import linter
run: uv run ruff format --diff
- name: Ruff linter (all rules)
id: ruff-check
run: |
source .venv/bin/activate
ruff check --select I
run: uv run ruff check

View File

@@ -5,35 +5,29 @@ on:
inputs:
gitref:
type: string
description: "what git ref to build"
description: 'what git tag to build (e.g. v0.0.74)'
required: true
jobs:
build:
name: "Build and upload wheels"
name: 'Build and upload wheels'
runs-on: ubuntu-latest
steps:
- name: Checkout repo
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.gitref }}
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: 'latest'
- name: Set up Python
run: uv python install 3.12
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
@@ -41,9 +35,9 @@ jobs:
path: ./dist
publish-to-pypi:
name: "Publish to PyPI"
name: 'Publish to PyPI'
runs-on: ubuntu-latest
needs: [ build ]
needs: [build]
environment:
name: pypi
url: https://pypi.org/p/pipecat-ai
@@ -62,12 +56,12 @@ jobs:
print-hash: true
publish-to-test-pypi:
name: "Publish to Test PyPI"
name: 'Publish to Test PyPI'
runs-on: ubuntu-latest
needs: [ build ]
needs: [build]
environment:
name: testpypi
url: https://pypi.org/p/pipecat-ai
url: https://test.pypi.org/p/pipecat-ai
permissions:
id-token: write
steps:
@@ -76,7 +70,7 @@ jobs:
with:
name: wheels
path: ./dist
- name: Publish to PyPI
- name: Publish to Test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
verbose: true

View File

@@ -4,7 +4,7 @@ on: workflow_dispatch
jobs:
build:
name: "Build and upload wheels"
name: 'Build and upload wheels'
runs-on: ubuntu-latest
steps:
- name: Checkout repo
@@ -12,23 +12,16 @@ jobs:
with:
fetch-tags: true
fetch-depth: 100
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
python-version: '3.10'
- name: Setup virtual environment
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt
version: 'latest'
- name: Set up Python
run: uv python install 3.12
- name: Install development dependencies
run: uv sync --group dev
- name: Build project
run: |
source .venv/bin/activate
python -m build
run: uv build
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
@@ -36,12 +29,12 @@ jobs:
path: ./dist
publish-to-test-pypi:
name: "Publish to Test PyPI"
name: 'Publish to Test PyPI'
runs-on: ubuntu-latest
needs: [ build ]
needs: [build]
environment:
name: testpypi
url: https://pypi.org/p/pipecat-ai
url: https://test.pypi.org/p/pipecat-ai
permissions:
id-token: write
steps:
@@ -50,7 +43,7 @@ jobs:
with:
name: wheels
path: ./dist
- name: Publish to PyPI
- name: Publish to Test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
verbose: true

View File

@@ -0,0 +1,61 @@
name: Python Compatibility Test
on:
push:
branches: [main, develop]
paths: ['pyproject.toml']
pull_request:
branches: [main, develop]
paths: ['pyproject.toml']
jobs:
test-compatibility:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: ['3.10.18', '3.11.13', '3.12.11', '3.13.5']
name: Python ${{ matrix.python-version }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
portaudio19-dev \
libcairo2-dev \
libgirepository1.0-dev \
pkg-config
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
version: 'latest'
- name: Set up Python ${{ matrix.python-version }}
run: |
uv python install ${{ matrix.python-version }}
uv python pin ${{ matrix.python-version }}
- name: Test uv sync with all extras (Python < 3.13)
if: "!startsWith(matrix.python-version, '3.13.')"
run: |
uv sync --group dev --all-extras --no-extra krisp
- name: Test uv sync without PyTorch extras (Python 3.13+)
if: startsWith(matrix.python-version, '3.13.')
run: |
uv sync --group dev --all-extras \
--no-extra krisp \
--no-extra ultravox \
--no-extra local-smart-turn \
--no-extra moondream \
--no-extra mlx-whisper
- name: Verify installation
run: |
uv run python --version
uv run python -c "import pipecat; print('✅ Pipecat imports successfully')"

51
.github/workflows/sync-quickstart.yaml vendored Normal file
View File

@@ -0,0 +1,51 @@
name: Sync Quickstart to pipecat-quickstart repo
on:
push:
branches: [main]
paths:
- 'examples/quickstart/**'
workflow_dispatch: # Manual trigger
jobs:
sync-quickstart:
runs-on: ubuntu-latest
steps:
- name: Checkout main repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout quickstart repo
uses: actions/checkout@v4
with:
repository: pipecat-ai/pipecat-quickstart
token: ${{ secrets.QUICKSTART_SYNC_TOKEN }}
path: quickstart-repo
- name: Sync files (excluding uv.lock and README.md)
run: |
# Copy all files except uv.lock and README.md
find examples/quickstart -type f \
-not -name "README.md" \
-not -name "uv.lock" \
-exec cp {} quickstart-repo/ \;
- name: Commit and push changes
run: |
cd quickstart-repo
git config user.name "GitHub Action"
git config user.email "action@github.com"
git add .
# Only commit if there are changes
if ! git diff --staged --quiet; then
git commit -m "Sync from pipecat main repo
Updated files from examples/quickstart/
Commit: ${{ github.sha }}
"
git push
else
echo "No changes to sync"
fi

View File

@@ -22,31 +22,23 @@ jobs:
steps:
- name: Checkout repo
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v3
with:
version: "latest"
- name: Set up Python
id: setup_python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Cache virtual environment
uses: actions/cache@v3
with:
# We are hashing dev-requirements.txt and test-requirements.txt which
# contain all dependencies needed to run the tests.
key: venv-${{ runner.os }}-${{ steps.setup_python.outputs.python-version}}-${{ hashFiles('dev-requirements.txt') }}-${{ hashFiles('test-requirements.txt') }}
path: .venv
run: uv python install 3.12
- name: Install system packages
id: install_system_packages
run: |
sudo apt-get install -y portaudio19-dev
- name: Setup virtual environment
- name: Install dependencies
run: |
python -m venv .venv
- name: Install basic Python dependencies
run: |
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r dev-requirements.txt -r test-requirements.txt
uv sync --group dev --extra anthropic --extra aws --extra google --extra langchain
- name: Test with pytest
run: |
source .venv/bin/activate
pytest
uv run pytest

7
.gitignore vendored
View File

@@ -31,8 +31,6 @@ MANIFEST
fly.toml
# Examples
examples/telnyx-chatbot/templates/streams.xml
examples/twilio-chatbot/templates/streams.xml
examples/**/node_modules/
examples/**/.expo/
examples/**/dist/
@@ -50,4 +48,7 @@ examples/**/web-build/
# Documentation
docs/api/_build/
docs/api/api
docs/api/api
# uv
.python-version

View File

@@ -1,8 +1,8 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.9.7
rev: v0.12.1
hooks:
- id: ruff
language_version: python3
args: [ --select, I, ]
args: [--fix]
- id: ruff-format

View File

@@ -9,22 +9,14 @@ build:
- python3-dev
- libasound2-dev
jobs:
pre_build:
- python -m pip install --upgrade pip
- pip install wheel setuptools
post_build:
- echo "Build completed"
post_install:
- pip install uv
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
sphinx:
configuration: docs/api/conf.py
fail_on_warning: false
python:
install:
- requirements: docs/api/requirements.txt
- method: pip
path: .
search:
ranking:
api/*: 5

File diff suppressed because it is too large

336
COMMUNITY_INTEGRATIONS.md Normal file
View File

@@ -0,0 +1,336 @@
# Community Integrations Guide
Pipecat welcomes community-maintained integrations! As our ecosystem grows, we've established a process for any developer to create and maintain their own service integrations while ensuring discoverability for the Pipecat community.
## Overview
**What we support:** Community-maintained integrations that live in separate repositories and are maintained by their authors.
**What we don't do:** The Pipecat team does not code review, test, or maintain community integrations. We provide guidance and list approved integrations for discoverability.
**Why this approach:** This allows the community to move quickly while keeping the Pipecat core team focused on maintaining the framework itself.
## Submitting your Integration
To be listed as an official community integration, follow these steps:
### Step 1: Build Your Integration
Create your integration following the patterns and examples shown in the "Integration Patterns and Examples" section below.
### Step 2: Set Up Your Repository
Your repository must contain these components:
- **Source code** - Complete implementation following Pipecat patterns
- **Foundational example** - Single file example showing basic usage (see [Pipecat examples](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational))
- **README.md** - Must include:
- Introduction and explanation of your integration
- Installation instructions
- Usage instructions with Pipecat Pipeline
- How to run your example
- Pipecat version compatibility (e.g., "Tested with Pipecat v0.0.86")
- Company attribution: If you work for the company providing the service, please mention this in your README. This helps build confidence that the integration will be actively maintained.
- **LICENSE** - Permissive license (BSD-2 like Pipecat, or equivalent open source terms)
- **Code documentation** - Source code with docstrings (we recommend following [Pipecat's docstring conventions](https://github.com/pipecat-ai/pipecat/blob/main/CONTRIBUTING.md#docstring-conventions))
- **Changelog** - Maintain a changelog for version updates
### Step 3: Join Discord
Join our Discord: https://discord.gg/pipecat
### Step 4: Submit for Listing
Submit a pull request to add your integration to our [Community Integrations documentation page](https://docs.pipecat.ai/server/services/community-integrations).
**To submit:**
1. Fork the [Pipecat docs repository](https://github.com/pipecat-ai/docs)
2. Edit the file `server/services/community-integrations.mdx`
3. Add your integration to the appropriate service category table with:
- Service name
- Link to your repository
- Maintainer GitHub username(s)
4. Include a link to your demo video (approx 30-60 seconds) in your PR description showing:
- Core functionality of your integration
- Handling of an interruption (if applicable to service type)
5. Submit your pull request
Once your PR is submitted, post in the `#community-integrations` Discord channel to let us know.
## Integration Patterns and Examples
### STT (Speech-to-Text) Services
#### Websocket-based Services
**Base class:** `STTService`
**Examples:**
- [DeepgramSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/deepgram/stt.py)
- [SpeechmaticsSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/speechmatics/stt.py)
#### File-based Services
**Base class:** `SegmentedSTTService`
**Examples:**
- [RivaSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/riva/stt.py)
- [FalSTTService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/fal/stt.py)
#### Key requirements:
- STT services should push `InterimTranscriptionFrame`s and `TranscriptionFrame`s
- If confidence values are available, filter for values >50% confidence
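The confidence requirement above can be sketched as a small filter. This is only an illustration under stated assumptions: `Transcript`, `should_emit`, and `MIN_CONFIDENCE` are hypothetical names invented for this example (not Pipecat APIs), and the result shape will differ per vendor.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold matching the ">50% confidence" guidance above.
MIN_CONFIDENCE = 0.5


@dataclass
class Transcript:
    """Hypothetical vendor result; real services define their own shape."""

    text: str
    is_final: bool
    confidence: Optional[float] = None  # None when the vendor reports no score


def should_emit(result: Transcript) -> bool:
    """Return True if the result is worth pushing downstream."""
    if not result.text.strip():
        return False  # nothing was recognized
    if result.confidence is None:
        return True  # no score available, so there is nothing to filter on
    return result.confidence > MIN_CONFIDENCE
```

In a real service, a result that passes this check would typically be pushed as a `TranscriptionFrame` when it is final and as an `InterimTranscriptionFrame` otherwise.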
### LLM (Large Language Model) Services
#### OpenAI-Compatible Services
**Base class:** `OpenAILLMService`
**Examples:**
- [AzureLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/azure/llm.py)
- [GrokLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/grok/llm.py) - Shows overriding the base class where needed
#### Non-OpenAI Compatible Services
**Requires:** Full implementation
**Examples:**
- [AnthropicLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/anthropic/llm.py)
- [GoogleLLMService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/llm.py)
#### Key requirements:
- **Frame sequence:** Output must follow this frame sequence pattern:
- `LLMFullResponseStartFrame` - Signals the start of an LLM response
- `LLMTextFrame` - Contains LLM content, typically streamed as tokens
- `LLMFullResponseEndFrame` - Signals the end of an LLM response
- **Context aggregation:** Implement context aggregation to collect user and assistant content:
- Aggregators come in pairs with a `user()` instance and `assistant()` instance
- Context must adhere to the `LLMContext` universal format
- Aggregators should handle adding messages, function calls, and images to the context
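The frame sequence above can be sketched as follows. To keep the example runnable without Pipecat installed, the three frame classes here are local stand-ins with the same names as the real frames, and `fake_token_stream`, `emit_response`, and `push` are hypothetical helpers. A real service would import the frames from Pipecat and push them through its own frame-pushing method.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Awaitable, Callable, List


# Stand-ins for the Pipecat frames named above (assumption: the real
# classes carry more fields than shown here).
@dataclass
class LLMFullResponseStartFrame:
    pass


@dataclass
class LLMTextFrame:
    text: str


@dataclass
class LLMFullResponseEndFrame:
    pass


async def fake_token_stream() -> AsyncIterator[str]:
    """Hypothetical vendor stream yielding tokens one at a time."""
    for token in ["Hello", ", ", "world"]:
        yield token


async def emit_response(
    tokens: AsyncIterator[str],
    push: Callable[[object], Awaitable[None]],
) -> None:
    """Push frames in the required start / text / end order."""
    await push(LLMFullResponseStartFrame())
    try:
        async for token in tokens:
            await push(LLMTextFrame(text=token))
    finally:
        # Push the end frame even if the token stream raises midway.
        await push(LLMFullResponseEndFrame())


async def collect() -> List[object]:
    """Run emit_response and record every pushed frame, in order."""
    frames: List[object] = []

    async def push(frame: object) -> None:
        frames.append(frame)

    await emit_response(fake_token_stream(), push)
    return frames
```

Wrapping the stream in `try`/`finally` is one defensive choice: downstream processors that wait for the end frame are not left hanging if the vendor stream fails.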
### TTS (Text-to-Speech) Services
#### AudioContextWordTTSService
**Use for:** Websocket-based services supporting word/timestamp alignment
**Example:**
- [CartesiaTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/cartesia/tts.py)
#### InterruptibleTTSService
**Use for:** Websocket-based services without word/timestamp alignment, requiring disconnection on interruption
**Example:**
- [SarvamTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/sarvam/tts.py)
#### WordTTSService
**Use for:** HTTP-based services supporting word/timestamp alignment
**Example:**
- [ElevenLabsHttpTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/elevenlabs/tts.py)
#### TTSService
**Use for:** HTTP-based services without word/timestamp alignment
**Example:**
- [GoogleHttpTTSService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/tts.py)
#### Key requirements:
- For websocket services, use the asyncio implementation of the `websockets` library (required for `websockets` v13+ support)
- Handle idle service timeouts with keepalives
- TTS services push both audio (`TTSAudioRawFrame`) and text (`TTSTextFrame`) frames
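The keepalive requirement can be sketched with a background asyncio task. This is a minimal sketch, not how any particular Pipecat service does it: `keepalive_loop`, `send_keepalive`, and `interval_secs` are hypothetical names, and the actual keepalive message and interval depend on the vendor.

```python
import asyncio
from typing import Awaitable, Callable


async def keepalive_loop(
    send_keepalive: Callable[[], Awaitable[None]],
    interval_secs: float,
) -> None:
    """Call send_keepalive every interval_secs until the task is cancelled."""
    while True:
        await asyncio.sleep(interval_secs)
        await send_keepalive()


async def demo() -> int:
    """Run the loop briefly and return how many keepalives were sent."""
    sent = 0

    async def send_keepalive() -> None:
        nonlocal sent
        sent += 1  # a real service would send a vendor-specific message here

    task = asyncio.create_task(keepalive_loop(send_keepalive, interval_secs=0.01))
    await asyncio.sleep(0.1)  # pretend the connection sits idle for a while
    task.cancel()  # stop the loop when the service disconnects
    try:
        await task
    except asyncio.CancelledError:
        pass
    return sent
```

Starting the task on connect and cancelling it on disconnect keeps the loop's lifetime tied to the websocket connection.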
### Telephony Serializers
Pipecat supports telephony provider integration using websocket connections to exchange MediaStreams. These services use a FrameSerializer to serialize and deserialize inputs from the FastAPIWebsocketTransport.
**Examples:**
- [Twilio](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/serializers/twilio.py)
- [Telnyx](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/serializers/telnyx.py)
#### Key requirements:
- Include hang-up functionality using the provider's native API, ideally using `aiohttp`
- Support DTMF (dual-tone multi-frequency) events if the provider supports them:
- Deserialize DTMF events from the provider's protocol to `InputDTMFFrame`
- Use `KeypadEntry` enum for valid keypad entries (0-9, \*, #, A-D)
- Handle invalid DTMF digits gracefully by returning `None`
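The DTMF handling above can be sketched as a lookup that returns `None` for anything outside the valid keypad set. The `KeypadEntry` enum below is a local stand-in so the example runs on its own (the member names are assumptions; use Pipecat's own enum in a real serializer), and `parse_dtmf_digit` is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class KeypadEntry(Enum):
    """Stand-in for Pipecat's KeypadEntry enum (member names assumed)."""

    ZERO = "0"
    ONE = "1"
    TWO = "2"
    THREE = "3"
    FOUR = "4"
    FIVE = "5"
    SIX = "6"
    SEVEN = "7"
    EIGHT = "8"
    NINE = "9"
    STAR = "*"
    POUND = "#"
    A = "A"
    B = "B"
    C = "C"
    D = "D"


def parse_dtmf_digit(digit: str) -> Optional[KeypadEntry]:
    """Map a provider digit string to a KeypadEntry, or None if invalid."""
    try:
        return KeypadEntry(digit)  # lookup by value
    except ValueError:
        return None  # unknown digit: skip it instead of raising
```

A serializer would build an `InputDTMFFrame` only when this lookup succeeds and return `None` from its deserialize step otherwise.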
### Image Generation Services
**Base class:** `ImageGenService`
**Examples:**
- [FalImageGenService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/fal/image.py)
- [GoogleImageGenService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/google/image.py)
#### Key requirements:
- Must implement `run_image_gen` method returning an `AsyncGenerator`
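In Python, a method "returning an `AsyncGenerator`" is an `async def` that uses `yield`. The sketch below shows only that shape: `ImageResultFrame`, `SketchImageGenService`, and `collect` are hypothetical, the `prompt: str` parameter is an assumption, and a real service would subclass `ImageGenService` and yield Pipecat frames.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class ImageResultFrame:
    """Hypothetical stand-in for whatever frame type a real service yields."""

    url: str


class SketchImageGenService:
    """Illustrative only; a real service would subclass ImageGenService."""

    async def run_image_gen(self, prompt: str) -> AsyncGenerator[ImageResultFrame, None]:
        # A real implementation would call the vendor API here.
        await asyncio.sleep(0)  # placeholder for the network round trip
        yield ImageResultFrame(url=f"https://example.invalid/{len(prompt)}.png")


async def collect(prompt: str) -> List[ImageResultFrame]:
    """Drain the generator the way a caller would, with `async for`."""
    service = SketchImageGenService()
    return [frame async for frame in service.run_image_gen(prompt)]
```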
### Vision Services
Vision services process images and provide analysis such as descriptions, object detection, or visual question answering.
**Base class:** `VisionService`
**Example:**
- [MoondreamVisionService](https://github.com/pipecat-ai/pipecat/blob/main/src/pipecat/services/moondream/vision.py)
#### Key requirements:
- Must implement `run_vision` method that takes an `LLMContext` and returns an `AsyncGenerator[Frame, None]`
- The method processes the latest image in the context and yields frames with analysis results
- Typically yields `TextFrame` objects containing descriptions or answers
## Implementation Guidelines
### Naming Conventions
- **STT:** `VendorSTTService`
- **LLM:** `VendorLLMService`
- **TTS:**
- Websocket: `VendorTTSService`
- HTTP: `VendorHttpTTSService`
- **Image:** `VendorImageGenService`
- **Vision:** `VendorVisionService`
- **Telephony:** `VendorFrameSerializer`
### Metrics Support
Enable metrics in your service:
```python
def can_generate_metrics(self) -> bool:
    """Check if this service can generate processing metrics.

    Returns:
        True, as this service supports metrics.
    """
    return True
```
### Dynamic Settings Updates
STT, LLM, and TTS services support `ServiceUpdateSettingsFrame` for dynamic configuration changes. The base STTService has an `_update_settings()` method that handles settings, and the private `_settings` `Dict` is used to store settings and provide access to the subclass.
```python
async def set_language(self, language: Language):
    """Set the recognition language and reconnect.

    Args:
        language: The language to use for speech recognition.
    """
    logger.info(f"Switching STT language to: [{language}]")
    self._settings["language"] = language
    await self._disconnect()
    await self._connect()
```
Note that, in this example, Deepgram requires the websocket connection be disconnected and reconnected to reinitialize the service with the new value. Consider if your service requires reconnection.
### Sample Rate Handling
Sample rates are set via PipelineParams and passed to each frame processor at initialization. The pattern is to _not_ set the sample rate value in the constructor of a given service. Instead, use the `start()` method to initialize sample rates from the frame:
```python
async def start(self, frame: StartFrame):
    """Start the service."""
    await super().start(frame)
    self._settings["output_format"]["sample_rate"] = self.sample_rate
    await self._connect()
```
Note that `self.sample_rate` is a `@property` set in the TTSService base class, which provides access to the private sample rate value obtained from the StartFrame.
### Tracing Decorators
Use Pipecat's tracing decorators:
- **STT:** `@traced_stt` - decorate a function that handles `transcript`, `is_final`, `language` as args
- **LLM:** `@traced_llm` - decorate the `_process_context()` method
- **TTS:** `@traced_tts` - decorate the `run_tts()` method
## Best Practices
### Packaging and Distribution
- Use [uv](https://docs.astral.sh/uv/) for packaging (encouraged)
- Consider releasing to PyPI for easier installation
- Follow semantic versioning principles
- Maintain a changelog
### HTTP Communication
For REST-based communication, use `aiohttp`. Pipecat already includes it as a required dependency, so using it avoids adding another dependency to your integration.
### Error Handling
- Wrap API calls in appropriate `try`/`except` blocks
- Handle rate limits and network failures gracefully
- Provide meaningful error messages
- When errors occur, raise exceptions AND push `ErrorFrame`s to notify the pipeline:
```python
from pipecat.frames.frames import ErrorFrame

try:
    # Your API call
    result = await self._make_api_call()
except Exception as e:
    # Push error frame to pipeline
    await self.push_error(ErrorFrame(error=f"{self} error: {e}"))
    # Raise or handle as appropriate
    raise
```
### Testing
- Your foundational example serves as a valuable integration-level test
- Unit tests are nice to have. As the Pipecat team provides better guidance, we will encourage unit testing more.
## Disclaimer
Community integrations are community-maintained and not officially supported by the Pipecat team. Users should evaluate these integrations independently. The Pipecat team reserves the right to remove listings that become unmaintained or problematic.
## Staying Up to Date
Pipecat evolves rapidly to support the latest AI technologies and patterns. While we strive to minimize breaking changes, they do occur as the framework matures.
**We strongly recommend:**
- Join our Discord at https://discord.gg/pipecat and monitor the `#announcements` channel for release notifications
- Follow our changelog: https://github.com/pipecat-ai/pipecat/blob/main/CHANGELOG.md
- Test your integration against new Pipecat releases promptly
- Update your README with the last tested Pipecat version
This helps ensure your integration remains compatible and your users have clear expectations about version support.
## Questions?
Join our Discord community at https://discord.gg/pipecat and post in the `#community-integrations` channel for guidance and support.
For additional questions, you can also reach out to us at pipecat-ai@daily.co.

View File

@@ -1,5 +1,9 @@
## Contributing to Pipecat
**Want to add a new service integration?**
We encourage community-maintained integrations! Please see our [Community Integrations Guide](COMMUNITY_INTEGRATIONS.md) for the process and requirements.
**Want to contribute to Pipecat core?**
We welcome contributions of all kinds! Your help is appreciated. Follow these steps to get involved:
1. **Fork this repository**: Start by forking the Pipecat Documentation repository to your GitHub account.
@@ -31,6 +35,23 @@ git push origin your-branch-name
Our maintainers will review your PR, and once everything is good, your contributions will be merged!
## Dependency Management
This project uses [uv](https://docs.astral.sh/uv/) for dependency management. The `uv.lock` file is committed to ensure reproducible builds.
### Adding or Updating Dependencies
1. Edit `pyproject.toml` to add/update dependencies
2. Run `uv lock` to update the lockfile with new dependency resolution
3. Run `uv sync` to install the updated dependencies locally
4. Always commit both files together:
```bash
git add pyproject.toml uv.lock
git commit -m "feat: add new dependency for feature X"
```
**Important:** Never manually edit `uv.lock`. It's auto-generated by `uv lock`.
## Code Style and Documentation
### Python Code Style
@@ -41,36 +62,150 @@ We use Ruff for code linting and formatting. Please ensure your code passes all
We follow Google-style docstrings with these specific conventions:
- Class docstrings should fully document all parameters used in `__init__`
- We don't require separate docstrings for `__init__` methods when parameters are documented in the class docstring
- Property methods should have docstrings explaining their purpose and return value
**Regular Classes:**
Example of correctly documented class:
- Class docstring describes the class purpose and key functionality
- `__init__` method has its own docstring with complete `Args:` section documenting all parameters
- All public methods must have docstrings with `Args:` and `Returns:` sections as appropriate
**Dataclasses:**
- Class docstring describes the purpose and documents all fields in a `Parameters:` section
- No `__init__` docstring (auto-generated)
**Properties:**
- Must have docstrings with `Returns:` section
**Abstract Methods:**
- Must have docstrings explaining what subclasses should implement
**`__init__.py` Files:**
- **Skip docstrings** for pure import/re-export modules
- **Add brief docstrings** for top-level packages or those with initialization logic
**Enums:**
- Class docstring describes the enumeration purpose
- Use `Parameters:` section to document each enum value and its meaning
- No `__init__` docstring (Enums don't have custom constructors)
**Code Examples in Docstrings:**
- Use `Examples:` as a section header for multiple examples
- Use descriptive text followed by double colons (`::`) for each example
- **Always include a blank line after the `::`**
- Indent all code consistently within each block
- Separate multiple examples with blank lines for readability
**Lists and Bullets in Docstrings:**
- Use dashes (`-`) for bullet points, not asterisks (`*`)
- **Add a blank line before bullet lists** when they follow a colon
- Use section headers like "Supported features:" or "Behavior:" before lists
- For complex nested information, consider using paragraph format instead
**Deprecations:**
- Use `warnings.warn()` in code for runtime deprecation warnings
- Add `.. deprecated::` directive in docstrings for documentation visibility
- Include version information and describe current status
- Describe parameters in present tense, use directive to indicate deprecation status
#### Examples:
```python
class MyClass:
"""Class description.
# Regular class
class MyService(BaseService):
"""Description of what the service does.
Additional details about the class.
Provides detailed explanation of the service's functionality,
key features, and usage patterns.
Args:
param1: Description of first parameter.
param2: Description of second parameter.
Supported features:
- Feature one with detailed explanation
- Feature two with additional context
- Feature three for advanced use cases
"""
def __init__(self, param1, param2):
# No docstring required here as parameters are documented above
self.param1 = param1
self.param2 = param2
def __init__(self, param1: str, old_param: str = None, **kwargs):
"""Initialize the service.
Args:
param1: Description of param1.
old_param: Controls legacy behavior.
.. deprecated:: 1.2.0
This parameter no longer has any effect and will be removed in version 2.0.
**kwargs: Additional arguments passed to parent.
"""
if old_param is not None:
import warnings
warnings.warn(
"Parameter 'old_param' is deprecated and will be removed in version 2.0.",
DeprecationWarning,
)
super().__init__(**kwargs)
@property
def some_property(self) -> str:
"""Get the formatted property value.
def sample_rate(self) -> int:
"""Get the current sample rate.
Returns:
A string representation of the property.
The sample rate in Hz.
"""
return f"Property: {self.param1}"
return self._sample_rate
async def process_data(self, data: str) -> bool:
"""Process the provided data.
Args:
data: The data to process.
Returns:
True if processing succeeded.
"""
pass
# Dataclass with code examples
@dataclass
class MessageFrame:
"""Frame containing messages in OpenAI format.
Supports both simple and content list message formats.
Example::
[
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi there!"}
]
Parameters:
messages: List of messages in OpenAI format.
"""
messages: List[dict]
# Enum class
class Status(Enum):
"""Status codes for processing operations.
Parameters:
PENDING: Operation is queued but not started.
RUNNING: Operation is currently in progress.
COMPLETED: Operation finished successfully.
FAILED: Operation encountered an error.
"""
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
```
# Contributor Covenant Code of Conduct

View File

@@ -1,40 +0,0 @@
# setup
FROM python:3.11.5
WORKDIR /app
COPY requirements.txt /app
COPY *.py /app
COPY pyproject.toml /app
COPY src/ /app/src/
COPY examples/ /app/examples/
WORKDIR /app
RUN ls --recursive /app/
RUN pip3 install --upgrade -r requirements.txt
RUN python -m build .
RUN pip3 install .
RUN pip3 install gunicorn
# If running on Ubuntu, Azure TTS requires some extra config
# https://learn.microsoft.com/en-us/azure/ai-services/speech-service/quickstarts/setup-platform?pivots=programming-language-python&tabs=linux%2Cubuntu%2Cdotnetcli%2Cdotnet%2Cjre%2Cmaven%2Cnodejs%2Cmac%2Cpypi
RUN wget -O - https://www.openssl.org/source/openssl-1.1.1w.tar.gz | tar zxf -
WORKDIR openssl-1.1.1w
RUN ./config --prefix=/usr/local
RUN make -j $(nproc)
RUN make install_sw install_ssldirs
RUN ldconfig -v
ENV SSL_CERT_DIR=/etc/ssl/certs
#ENV LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
RUN apt clean
RUN apt-get update
RUN apt-get -y install build-essential libssl-dev ca-certificates libasound2 wget
ENV PYTHONUNBUFFERED=1
WORKDIR /app
EXPOSE 8000
# run
CMD ["gunicorn", "--workers=2", "--log-level", "debug", "--chdir", "examples/server", "--capture-output", "daily-bot-manager:app", "--bind=0.0.0.0:8000"]

4
MANIFEST.in Normal file
View File

@@ -0,0 +1,4 @@
prune docs
prune examples
prune scripts
prune tests

241
README.md
View File

@@ -2,12 +2,15 @@
<img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div></h1>
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)
[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) ![Tests](https://github.com/pipecat-ai/pipecat/actions/workflows/tests.yaml/badge.svg) [![codecov](https://codecov.io/gh/pipecat-ai/pipecat/graph/badge.svg?token=LNVUIVO4Y9)](https://codecov.io/gh/pipecat-ai/pipecat) [![Docs](https://img.shields.io/badge/Documentation-blue)](https://docs.pipecat.ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/pipecat-ai/pipecat)
[![](https://getmanta.ai/api/badges?text=Manta%20Graph&link=manta)](https://getmanta.ai/pipecat)
# 🎙️ Pipecat: Real-Time Voice & Multimodal AI Agents
**Pipecat** is an open-source Python framework for building real-time voice and multimodal conversational agents. Orchestrate audio and video, AI services, different transports, and conversation pipelines effortlessly—so you can focus on what makes your agent unique.
> Want to dive right in? Try the [quickstart](https://docs.pipecat.ai/getting-started/quickstart).
## 🚀 What You Can Build
- **Voice Assistants** natural, streaming conversations with AI
@@ -17,8 +20,6 @@
- **Business Agents** customer intake, support bots, guided flows
- **Complex Dialog Systems** design logic with structured conversations
🧭 Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
## 🧠 Why Pipecat?
- **Voice-first**: Integrates speech recognition, text-to-speech, and conversation handling
@@ -26,170 +27,158 @@
- **Composable Pipelines**: Build complex behavior from modular components
- **Real-Time**: Ultra-low latency interaction with different transports (e.g. WebSockets or WebRTC)
## 🌐 Pipecat Ecosystem
### 📱 Client SDKs
Building client applications? You can connect to Pipecat from any platform using our official SDKs:
<a href="https://docs.pipecat.ai/client/js/introduction">JavaScript</a> | <a href="https://docs.pipecat.ai/client/react/introduction">React</a> | <a href="https://docs.pipecat.ai/client/react-native/introduction">React Native</a> |
<a href="https://docs.pipecat.ai/client/ios/introduction">Swift</a> | <a href="https://docs.pipecat.ai/client/android/introduction">Kotlin</a> | <a href="https://docs.pipecat.ai/client/c++/introduction">C++</a> | <a href="https://github.com/pipecat-ai/pipecat-esp32">ESP32</a>
### 🧭 Structured conversations
Looking to build structured conversations? Check out [Pipecat Flows](https://github.com/pipecat-ai/pipecat-flows) for managing complex conversational states and transitions.
### 🪄 Beautiful UIs
Want to build beautiful and engaging experiences? Check out the [Voice UI Kit](https://github.com/pipecat-ai/voice-ui-kit), a collection of components, hooks and templates for building voice AI applications quickly.
### 🛠️ Create and deploy projects
Create a new project in under a minute with the [Pipecat CLI](https://github.com/pipecat-ai/pipecat-cli). Then use the CLI to monitor and deploy your agent to production.
### 🔍 Debugging
Looking for help debugging your pipeline and processors? Check out [Whisker](https://github.com/pipecat-ai/whisker), a real-time Pipecat debugger.
### 🖥️ Terminal
Love terminal applications? Check out [Tail](https://github.com/pipecat-ai/tail), a terminal dashboard for Pipecat.
### 📺️ Pipecat TV Channel
Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.youtube.com/playlist?list=PLzU2zoMTQIHjqC3v4q2XVSR3hGSzwKFwH) channel.
## 🎬 See it in action
<p float="left">
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/simple-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12-describe-video.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/assets/moondream.png" width="400" /></a>
</p>
## 📱 Client SDKs
You can connect to Pipecat from any platform using our official SDKs:
| Platform | SDK Repo | Description |
| -------- | ------------------------------------------------------------------------------ | -------------------------------- |
| Web | [pipecat-client-web](https://github.com/pipecat-ai/pipecat-client-web) | JavaScript and React client SDKs |
| iOS | [pipecat-client-ios](https://github.com/pipecat-ai/pipecat-client-ios) | Swift SDK for iOS |
| Android | [pipecat-client-android](https://github.com/pipecat-ai/pipecat-client-android) | Kotlin SDK for Android |
| C++ | [pipecat-client-cxx](https://github.com/pipecat-ai/pipecat-client-cxx) | C++ client SDK |
## 🧩 Available services
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Parakeet (NVIDIA)](https://docs.pipecat.ai/server/services/stt/parakeet), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [FastPitch (NVIDIA)](https://docs.pipecat.ai/server/services/tts/fastpitch), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Video | [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [Noisereduce](https://docs.pipecat.ai/server/utilities/audio/noisereduce-filter) |
| Analytics & Metrics | [Canonical AI](https://docs.pipecat.ai/server/services/analytics/canonical), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [SambaNova (Whisper)](https://docs.pipecat.ai/server/services/stt/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Ultravox](https://docs.pipecat.ai/server/services/stt/ultravox), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nim), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [PlayHT](https://docs.pipecat.ai/server/services/tts/playht), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai) |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/fal), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
## ⚡ Getting started
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when youre ready.
You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you're ready.
```shell
# Install the module
pip install pipecat-ai
1. Install uv
# Set up your environment
cp dot-env.template .env
```
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
> **Need help?** Refer to the [uv install documentation](https://docs.astral.sh/uv/getting-started/installation/).
```shell
pip install "pipecat-ai[option,...]"
```
2. Install the module
```bash
# For new projects
uv init my-pipecat-app
cd my-pipecat-app
uv add pipecat-ai
# Or for existing projects
uv add pipecat-ai
```
3. Set up your environment
```bash
cp env.example .env
```
4. To keep things lightweight, only the core framework is included by default. If you need support for third-party AI services, you can add the necessary dependencies with:
```bash
uv add "pipecat-ai[option,...]"
```
> **Using pip?** You can still use `pip install pipecat-ai` and `pip install "pipecat-ai[option,...]"` to get set up.
## 🧪 Code examples
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development
- [Example apps](https://github.com/pipecat-ai/pipecat-examples) — complete applications that you can use as starting points for development
## 🛠️ Hacking on the framework itself
## 🛠️ Contributing to the framework
1. Set up a virtual environment before following these instructions. From the root of the repo:
### Prerequisites
```shell
python3 -m venv venv
source venv/bin/activate
**Minimum Python Version:** 3.10
**Recommended Python Version:** 3.12
### Setup Steps
1. Clone the repository and navigate to it:
```bash
git clone https://github.com/pipecat-ai/pipecat.git
cd pipecat
```
2. Install the development dependencies:
2. Install development and testing dependencies:
```shell
pip install -r dev-requirements.txt
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
--no-extra ultravox # (ultravox not fully supported on macOS)
```
3. Install the git pre-commit hooks (these help ensure your code follows project rules):
3. Install the git pre-commit hooks:
```shell
pre-commit install
```bash
uv run pre-commit install
```
4. Install the `pipecat-ai` package locally in editable mode:
```shell
pip install -e .
```
> The `-e` or `--editable` option allows you to modify the code without reinstalling.
5. Include optional dependencies as needed. For example:
```shell
pip install -e ".[daily,deepgram,cartesia,openai,silero]"
```
6. (Optional) If you want to use this package from another directory:
```shell
pip install "path_to_this_repo[option,...]"
```
> **Note**: Some extras (local, gstreamer) require system dependencies. See documentation if you encounter build errors.
### Running tests
Install the test dependencies:
To run all tests, from the root directory:
```shell
pip install -r test-requirements.txt
```bash
uv run pytest
```
From the root directory, run:
Run a specific test suite:
```shell
pytest
```bash
uv run pytest tests/test_name.py
```
### Setting up your editor
This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting via [Ruff](https://github.com/astral-sh/ruff).
#### Emacs
You can use [use-package](https://github.com/jwiegley/use-package) to install the [emacs-lazy-ruff](https://github.com/christophermadsen/emacs-lazy-ruff) package and configure `ruff` arguments:
```elisp
(use-package lazy-ruff
:ensure t
:hook ((python-mode . lazy-ruff-mode))
:config
(setq lazy-ruff-format-command "ruff format")
(setq lazy-ruff-check-command "ruff check --select I"))
```
`ruff` was installed in the `venv` environment described before, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.
```elisp
(use-package pyvenv-auto
:ensure t
:defer t
:hook ((python-mode . pyvenv-auto-run)))
```
#### Visual Studio Code
Install the
[Ruff](https://marketplace.visualstudio.com/items?itemName=charliermarsh.ruff) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) and set it as the default Python formatter, and enable formatting on save:
```json
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true
}
```
#### PyCharm
`ruff` was installed in the `venv` environment described before. To enable autoformatting on save, go to `File` -> `Settings` -> `Tools` -> `File Watchers` and add a new watcher with the following settings:
1. **Name**: `Ruff formatter`
2. **File type**: `Python`
3. **Working directory**: `$ContentRoot$`
4. **Arguments**: `format $FilePath$`
5. **Program**: `$PyInterpreterDirectory$/ruff`
## 🤝 Contributing
We welcome contributions from the community! Whether you're fixing bugs, improving documentation, or adding new features, here's how you can help:

SECURITY.md Normal file

@@ -0,0 +1,5 @@
# Security Policy
## Reporting a Vulnerability
Please email `disclosures@daily.co`.


@@ -1,13 +0,0 @@
build~=1.2.2
coverage~=7.6.12
grpcio-tools~=1.67.1
pip-tools~=7.4.1
pre-commit~=4.0.1
pyright~=1.1.397
pytest~=8.3.4
pytest-asyncio~=0.25.3
pytest-aiohttp==1.1.0
ruff~=0.11.1
setuptools~=70.0.0
setuptools_scm~=8.1.0
python-dotenv~=1.0.1


@@ -1,10 +0,0 @@
# Pipecat Docs
## [Architecture Overview](architecture.md)
Learn about the thinking behind the framework's design.
## [A Frame's Progress](frame-progress.md)
See how a Frame is processed through a Transport, a Pipeline, and a series of Frame Processors.


@@ -1,10 +1,27 @@
#!/bin/bash
# Build docs using uv
echo "Installing dependencies with uv..."
uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra ultravox --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
# Check if sphinx-build is available
if ! uv run sphinx-build --version &> /dev/null; then
echo "Error: sphinx-build is not available" >&2
exit 1
fi
# Clean previous build
rm -rf _build
echo "Building documentation..."
# Build docs matching ReadTheDocs configuration
sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
uv run sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
# Open docs (MacOS)
open _build/html/index.html
if [ $? -eq 0 ]; then
echo "Documentation built successfully!"
# Open docs (MacOS)
open _build/html/index.html
else
echo "Documentation build failed!" >&2
exit 1
fi


@@ -1,5 +1,7 @@
import logging
import os
import sys
from datetime import datetime
from pathlib import Path
# Configure logging
@@ -13,7 +15,8 @@ sys.path.insert(0, str(project_root / "src"))
# Project information
project = "pipecat-ai"
copyright = "2024, Daily"
current_year = datetime.now().year
copyright = f"2024-{current_year}, Daily" if current_year > 2024 else "2024, Daily"
author = "Daily"
# General configuration
@@ -24,107 +27,61 @@ extensions = [
"sphinx.ext.intersphinx",
]
suppress_warnings = [
"autodoc.mocked_object",
"toc.not_included",
]
# Napoleon settings
napoleon_google_docstring = True
napoleon_numpy_docstring = False
napoleon_include_init_with_doc = True
# AutoDoc settings
autodoc_default_options = {
"members": True,
"member-order": "bysource",
"special-members": "__init__",
"undoc-members": True,
"exclude-members": "__weakref__",
"no-index": True,
"undoc-members": False,
"exclude-members": "__weakref__,model_config",
"show-inheritance": True,
}
# Mock imports for optional dependencies
autodoc_mock_imports = [
"riva",
"livekit",
"pyht", # Base PlayHT package
"pyht.async_client", # PlayHT specific imports
"pyht.client",
"pyht.protos",
"pyht.protos.api_pb2",
"pipecat_ai_playht", # PlayHT wrapper
"aiortc",
"aiortc.mediastreams",
"cv2",
"av",
"pyneuphonic",
"mem0",
"mlx_whisper",
"anthropic",
"assemblyai",
"boto3",
"azure",
"cartesia",
"deepgram",
"elevenlabs",
"fal",
"gladia",
"google",
"krisp",
"langchain",
"lmnt",
"noisereduce",
"openai",
"openpipe",
"simli",
"soundfile",
# Krisp - has build issues on some platforms
"pipecat_ai_krisp",
"pyaudio",
"krisp",
"krisp_audio",
# System-specific GUI libraries
"_tkinter",
"tkinter",
"daily",
"daily_python",
"pydantic.BaseModel",
"pydantic.Field",
"pydantic._internal._model_construction",
"pydantic._internal._fields",
# Moondream dependencies
"torch",
"transformers",
"intel_extension_for_pytorch",
# Ultravox dependencies
"huggingface_hub",
# Platform-specific audio libraries (if needed)
"gi",
"gi.require_version",
"gi.repository",
# OpenCV - sometimes has import issues during docs build
"cv2",
# Heavy ML packages excluded from ReadTheDocs
# ultravox dependencies
"vllm",
"vllm.engine.arg_utils",
# local-smart-turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"torchaudio",
# moondream dependencies
"transformers",
"transformers.AutoTokenizer",
# Langchain dependencies
"langchain_core",
"langchain_core.messages",
"langchain_core.runnables",
"langchain_core.messages.AIMessageChunk",
"langchain_core.runnables.Runnable",
# LiveKit dependencies
"livekit",
"livekit.rtc",
"livekit_api",
"livekit_protocol",
"tenacity",
"tenacity.retry",
"tenacity.stop_after_attempt",
"tenacity.wait_exponential",
"rtc",
"rtc.Room",
"rtc.RoomOptions",
"rtc.AudioSource",
"rtc.LocalAudioTrack",
"rtc.TrackPublishOptions",
"rtc.TrackSource",
"rtc.AudioStream",
"rtc.AudioFrameEvent",
"rtc.AudioFrame",
"rtc.Track",
"rtc.TrackKind",
"rtc.RemoteParticipant",
"rtc.RemoteTrackPublication",
"rtc.DataPacket",
# Riva dependencies
"transformers.AutoFeatureExtractor",
"AutoFeatureExtractor",
"timm",
"einops",
"intel_extension_for_pytorch",
"huggingface_hub",
# riva dependencies
"riva",
"riva.client",
"riva.client.Auth",
@@ -134,96 +91,45 @@ autodoc_mock_imports = [
"riva.client.AudioEncoding",
"riva.client.proto.riva_tts_pb2",
"riva.client.SpeechSynthesisService",
# Local CoreML Smart Turn dependencies
"coremltools",
"coremltools.models",
"coremltools.models.MLModel",
"torch",
"torch.nn",
"torch.nn.functional",
"transformers",
"transformers.AutoFeatureExtractor",
# Also add specific classes that are imported
"AutoFeatureExtractor",
# MLX dependencies (Apple Silicon specific)
"mlx",
"mlx_whisper", # Note: might need underscore format too
]
# HTML output settings
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
autodoc_typehints = "description"
html_static_path = ["_static"] if os.path.exists("_static") else []
autodoc_typehints = "signature" # Show type hints in the signature only, not in the docstring
html_show_sphinx = False
def verify_modules():
"""Verify that required modules are available."""
required_modules = {
"services": [
"assemblyai",
"aws",
"cartesia",
"deepgram",
"google",
"lmnt",
"riva",
"simli",
],
"serializers": ["livekit"],
"vad": ["silero", "vad_analyzer"],
"transports": {
"services": ["daily", "livekit"],
"local": ["audio", "tk"],
"network": ["fastapi_websocket", "websocket_server"],
},
}
def import_core_modules():
"""Import core pipecat modules for autodoc to discover."""
core_modules = [
"pipecat",
"pipecat.frames",
"pipecat.pipeline",
"pipecat.processors",
"pipecat.services",
"pipecat.transports",
"pipecat.audio",
"pipecat.adapters",
"pipecat.clocks",
"pipecat.metrics",
"pipecat.observers",
"pipecat.runner",
"pipecat.serializers",
"pipecat.sync",
"pipecat.transcriptions",
"pipecat.utils",
]
# Skip importing modules that are in autodoc_mock_imports
skipped_modules = set(autodoc_mock_imports)
missing = []
for category, modules in required_modules.items():
if isinstance(modules, dict):
# Handle nested structure
for subcategory, submodules in modules.items():
for module in submodules:
# Check if module is in autodoc_mock_imports
if (
f"pipecat.{category}.{subcategory}.{module}" in skipped_modules
or module in skipped_modules
):
logger.info(
f"Skipping import of mocked module: pipecat.{category}.{subcategory}.{module}"
)
continue
try:
__import__(f"pipecat.{category}.{subcategory}.{module}")
logger.info(
f"Successfully imported pipecat.{category}.{subcategory}.{module}"
)
except (ImportError, TypeError, NameError) as e:
missing.append(f"pipecat.{category}.{subcategory}.{module}")
logger.warning(
f"Optional module not available: pipecat.{category}.{subcategory}.{module} - {str(e)}"
)
else:
# Handle flat structure
for module in modules:
# Check if module is in autodoc_mock_imports
if f"pipecat.{category}.{module}" in skipped_modules or module in skipped_modules:
logger.info(f"Skipping import of mocked module: pipecat.{category}.{module}")
continue
try:
__import__(f"pipecat.{category}.{module}")
logger.info(f"Successfully imported pipecat.{category}.{module}")
except (ImportError, TypeError, NameError) as e:
missing.append(f"pipecat.{category}.{module}")
logger.warning(
f"Optional module not available: pipecat.{category}.{module} - {str(e)}"
)
if missing:
logger.warning(f"Some optional modules are not available: {missing}")
for module_name in core_modules:
try:
__import__(module_name)
logger.info(f"Successfully imported {module_name}")
except ImportError as e:
logger.warning(f"Failed to import {module_name}: {e}")
def clean_title(title: str) -> str:
@@ -235,36 +141,7 @@ def clean_title(title: str) -> str:
parts = title.split(".")
title = parts[-1]
# Special cases for service names and common acronyms
special_cases = {
"ai": "AI",
"aws": "AWS",
"api": "API",
"vad": "VAD",
"assemblyai": "AssemblyAI",
"deepgram": "Deepgram",
"elevenlabs": "ElevenLabs",
"openai": "OpenAI",
"openpipe": "OpenPipe",
"playht": "PlayHT",
"xtts": "XTTS",
"lmnt": "LMNT",
}
# Check if the entire title is a special case
if title.lower() in special_cases:
return special_cases[title.lower()]
# Otherwise, capitalize each word
words = title.split("_")
cleaned_words = []
for word in words:
if word.lower() in special_cases:
cleaned_words.append(special_cases[word.lower()])
else:
cleaned_words.append(word.capitalize())
return " ".join(cleaned_words)
return title
def setup(app):
@@ -289,9 +166,8 @@ def setup(app):
excludes = [
str(project_root / "src/pipecat/pipeline/to_be_updated"),
str(project_root / "src/pipecat/processors/gstreamer"),
str(project_root / "src/pipecat/services/to_be_updated"),
str(project_root / "src/pipecat/vad"), # deprecated
str(project_root / "src/pipecat/examples"),
str(project_root / "src/pipecat/tests"),
"**/test_*.py",
"**/tests/*.py",
]
@@ -332,5 +208,4 @@ def setup(app):
logger.error(f"Error generating API documentation: {e}", exc_info=True)
# Run module verification
verify_modules()
import_core_modules()

View File

@@ -1,81 +1,36 @@
Pipecat API Reference Docs
==========================
Pipecat API Reference
=====================
Welcome to Pipecat's API reference documentation!
Welcome to the Pipecat API reference.
Pipecat is an open source framework for building voice and multimodal assistants.
It provides a flexible pipeline architecture for connecting various AI services,
audio processing, and transport layers.
Use the navigation on the left to browse modules, or search using the search box.
**New to Pipecat?** Check out the `main documentation <https://docs.pipecat.ai>`_ for tutorials, guides, and client SDK information.
Quick Links
-----------
* `GitHub Repository <https://github.com/pipecat-ai/pipecat>`_
* `Website <https://pipecat.ai>`_
API Reference
-------------
Core Components
~~~~~~~~~~~~~~~
* :mod:`Frames <pipecat.frames>`
* :mod:`Processors <pipecat.processors>`
* :mod:`Pipeline <pipecat.pipeline>`
Audio Processing
~~~~~~~~~~~~~~~~
* :mod:`Audio <pipecat.audio>`
Services
~~~~~~~~
* :mod:`Services <pipecat.services>`
Transport & Serialization
~~~~~~~~~~~~~~~~~~~~~~~~~
* :mod:`Transports <pipecat.transports>`
* :mod:`Local <pipecat.transports.local>`
* :mod:`Network <pipecat.transports.network>`
* :mod:`Services <pipecat.transports.services>`
* :mod:`Serializers <pipecat.serializers>`
Utilities
~~~~~~~~~
* :mod:`Adapters <pipecat.adapters>`
* :mod:`Clocks <pipecat.clocks>`
* :mod:`Metrics <pipecat.metrics>`
* :mod:`Observers <pipecat.observers>`
* :mod:`Sync <pipecat.sync>`
* :mod:`Transcriptions <pipecat.transcriptions>`
* :mod:`Utils <pipecat.utils>`
* `Join our Community <https://discord.gg/pipecat>`_
.. toctree::
:maxdepth: 3
:maxdepth: 2
:caption: API Reference
:hidden:
Adapters <api/pipecat.adapters>
Audio <api/pipecat.audio>
Clocks <api/pipecat.clocks>
Extensions <api/pipecat.extensions>
Frames <api/pipecat.frames>
Metrics <api/pipecat.metrics>
Observers <api/pipecat.observers>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Runner <api/pipecat.runner>
Serializers <api/pipecat.serializers>
Services <api/pipecat.services>
Sync <api/pipecat.sync>
Transcriptions <api/pipecat.transcriptions>
Transports <api/pipecat.transports>
Utils <api/pipecat.utils>
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Utils <api/pipecat.utils>

View File

@@ -1,54 +0,0 @@
# Sphinx dependencies
sphinx>=8.1.3
sphinx-rtd-theme
sphinx-markdown-builder
sphinx-autodoc-typehints
toml
# Install all extras individually to ensure they're properly resolved
pipecat-ai[anthropic]
pipecat-ai[assemblyai]
pipecat-ai[aws]
pipecat-ai[azure]
pipecat-ai[canonical]
pipecat-ai[cartesia]
pipecat-ai[cerebras]
pipecat-ai[deepseek]
pipecat-ai[daily]
pipecat-ai[deepgram]
pipecat-ai[elevenlabs]
pipecat-ai[fal]
pipecat-ai[fireworks]
pipecat-ai[fish]
pipecat-ai[gladia]
pipecat-ai[google]
pipecat-ai[grok]
pipecat-ai[groq]
# pipecat-ai[krisp] # Mocked
pipecat-ai[koala]
# pipecat-ai[langchain] # Mocked
# pipecat-ai[livekit] # Mocked
pipecat-ai[lmnt]
pipecat-ai[local]
# pipecat-ai[local-smart-turn] # Mocked
# pipecat-ai[mem0] # Mocked
# pipecat-ai[mlx-whisper] # Mocked
# pipecat-ai[moondream] # Mocked
pipecat-ai[nim]
# pipecat-ai[neuphonic] # Mocked
pipecat-ai[noisereduce]
pipecat-ai[openai]
# pipecat-ai[openpipe]
# pipecat-ai[playht] # Mocked due to grpcio conflict with riva
pipecat-ai[qwen]
pipecat-ai[remote-smart-turn]
# pipecat-ai[riva] # Mocked
pipecat-ai[silero]
pipecat-ai[simli]
pipecat-ai[soundfile]
pipecat-ai[tavus]
pipecat-ai[together]
# pipecat-ai[ultravox] # Mocked
# pipecat-ai[webrtc] # Mocked
pipecat-ai[websocket]
pipecat-ai[whisper]

View File

@@ -1,17 +0,0 @@
# Pipecat architecture guide
## Frames
Frames can represent discrete chunks of data, for instance a chunk of text, a chunk of audio, or an image. They can also be used for control flow, for instance a frame that indicates that there is no more data available, or that a user started or stopped talking. They can also represent more complex data structures, such as a message array used for an LLM completion.
## FrameProcessors
Frame processors operate on frames. Every frame processor implements a `process_frame` method that consumes one frame and produces zero or more frames. Frame processors can do simple transforms, such as concatenating text fragments into sentences, or they can treat frames as input for an AI Service, and emit chat completions based on message arrays or transform text into audio or images.
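The `process_frame` contract described above can be sketched in plain Python. This is a minimal illustration, not Pipecat's actual classes: `TextFrame`, `UppercaseProcessor`, and `run` are hypothetical names, and the sketch assumes a synchronous generator-style method for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class TextFrame:
    """Hypothetical stand-in for a text-carrying frame (not Pipecat's class)."""

    text: str


class UppercaseProcessor:
    """Toy processor: consumes one frame and yields zero or more frames."""

    def process_frame(self, frame: object) -> Iterator[object]:
        if isinstance(frame, TextFrame):
            if frame.text.strip():
                # One frame in, one transformed frame out.
                yield TextFrame(frame.text.upper())
            # Whitespace-only text produces zero frames.
        else:
            # Frames this processor does not handle pass through unchanged.
            yield frame


def run(processor: UppercaseProcessor, frames: Iterable[object]) -> List[object]:
    """Feed frames through a single processor and collect everything it yields."""
    out: List[object] = []
    for frame in frames:
        out.extend(processor.process_frame(frame))
    return out
```

The generator form makes the "zero or more frames" part explicit: a processor that has nothing to emit simply yields nothing.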
## Pipelines
Pipelines are lists of frame processors linked together. Frame processors can push frames upstream or downstream to their peers. A very simple pipeline might chain an LLM frame processor to a text-to-speech frame processor, with a transport as an output.
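As a rough sketch of the chaining idea, the following wires two stub processors together so that the output of one becomes the input of the next. The names `llm_stub`, `tts_stub`, and `run_pipeline` are made up for illustration, the stubs do no real LLM or TTS work, and only the downstream direction is modeled.

```python
from typing import Callable, Iterable, Iterator, List

# A processor here is any callable that maps one frame to zero or more frames.
Processor = Callable[[object], Iterator[object]]


def llm_stub(frame: object) -> Iterator[object]:
    """Pretend LLM: turns a prompt string into a single reply string."""
    if isinstance(frame, str):
        yield f"echo: {frame}"


def tts_stub(frame: object) -> Iterator[object]:
    """Pretend TTS: turns text into bytes that stand in for audio."""
    if isinstance(frame, str):
        yield frame.encode("utf-8")


def _apply(processor: Processor, stream: Iterable[object]) -> Iterator[object]:
    """Run every frame of the incoming stream through one processor."""
    for frame in stream:
        yield from processor(frame)


def run_pipeline(processors: List[Processor], frames: Iterable[object]) -> List[object]:
    """Link processors in order; each one consumes what the previous one yields."""
    stream: Iterable[object] = frames
    for processor in processors:
        stream = _apply(processor, stream)
    return list(stream)
```

Calling `run_pipeline([llm_stub, tts_stub], ["hello"])` sends the prompt through the LLM stub first and then through the TTS stub, which is the order a transport output would expect.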
## Transports
Transports provide input and output frame processors to receive or send frames respectively. For example, the `DailyTransport` does this with a WebRTC session joined to a Daily.co room.

View File

@@ -1,46 +0,0 @@
# A Frame's Progress
1. A user says “Hello, LLM” and the cloud transcription service delivers a transcription to the Transport.
![A transcript frame arrives](images/frame-progress-01.png)
2. The Transport places a Transcription frame in the Pipeline's source queue.
![Frame in source queue](images/frame-progress-02.png)
3. The Pipeline passes the Transcription frame to the first Frame Processor in its list, the LLM User Message Aggregator.
![To UMA](images/frame-progress-03.png)
4. The LLM User Message Aggregator updates the LLM Context with a `{“user”: “Hello LLM”}` message.
![Update context](images/frame-progress-04.png)
5. The LLM User Message Aggregator yields an LLM Message Frame, containing the updated LLM Context. The Pipeline passes this frame to the LLM Frame Processor.
![Update context](images/frame-progress-05.png)
6. The LLM Frame Processor creates a streaming chat completion based on the LLM context and yields the first chunk of a response, a Text Frame with the value “Hi, ”. The Pipeline passes this frame to the TTS Frame Processor. The TTS Frame Processor aggregates this response but doesn't yield anything yet, because it's waiting for a full sentence.
![LLM yields Text](images/frame-progress-06.png)
7. The LLM Frame Processor yields another Text Frame with the value “there.”. The Pipeline passes this frame to the TTS Frame Processor.
![LLM yields more Text](images/frame-progress-07.png)
8. The TTS Frame Processor now has a full sentence, so it starts streaming audio based on “Hi, there.” It yields the first chunk of streaming audio as an Audio frame, which the Pipeline passes to the LLM Assistant Message Aggregator.
![TTS yields Audio](images/frame-progress-08.png)
9. The LLM Assistant Message Aggregator doesn't do anything with Audio frames, so it immediately yields the frame, unchanged. This is the convention for all Frame Processors: frames that the processor doesn't process should be immediately yielded.
![pass-through](images/frame-progress-09.png)
10. The Pipeline places the first Audio frame in its sink queue, which is being watched by the Transport. Since the frame is now in a queue, the Pipeline can continue processing other frames. Note that the source and sink queues form a sort of “boundary of concurrent processing” between a Pipeline and the outside world. In a Pipeline, Frames are processed sequentially; once a Frame is on a queue it can be processed in parallel with the frames being processed by the Pipeline. TODO: link to a more in-depth section about this.
![sink queue](images/frame-progress-10.png)
11. The TTS Frame Processor yields another Audio frame as the Transport transmits the first Audio frame.
![parallel audio](images/frame-progress-11.png)
12. As before, the LLM Assistant Message Aggregator immediately yields the Audio frame and the Pipeline places the Audio frame in the sink queue.
![sink queue 2](images/frame-progress-12.png)
13. The TTS Frame Processor has no more frames to yield. The LLM Frame Processor emits an LLM Response End Frame, which the Pipeline passes to the TTS Frame Processor.
![response end](images/frame-progress-13.png)
14. The TTS Frame Processor immediately yields the LLM Response End Frame, so the Pipeline passes it along to the LLM Assistant Message Aggregator. The LLM Assistant Message Aggregator updates the LLM Context with the full response from the LLM. TODO: I realized I forgot that the TTS Frame Processor also yields the Text frames that the LLM emitted so that the LLM Assistant Message Aggregator could accumulate them, arrggh.
![response end](images/frame-progress-14.png)
15. The system is quiet, and waiting for the next message from the Transport.
![response end](images/frame-progress-15.png)
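The sentence-waiting behavior in steps 6 to 8 and the pass-through convention in step 9 could look roughly like the sketch below. `SentenceAggregator` is a hypothetical name, text frames are modeled as plain strings, and the end-of-sentence check is a deliberately naive assumption rather than what any real TTS processor does.

```python
from typing import Iterator


class SentenceAggregator:
    """Toy aggregator: buffers text chunks until they form a full sentence."""

    def __init__(self) -> None:
        self._buffer = ""

    def process_frame(self, frame: object) -> Iterator[object]:
        if not isinstance(frame, str):
            # Convention: frames this processor doesn't handle are yielded as-is.
            yield frame
            return
        self._buffer += frame
        # Naive assumption: a sentence ends with '.', '!' or '?'.
        if self._buffer.rstrip().endswith((".", "!", "?")):
            sentence, self._buffer = self._buffer, ""
            yield sentence
```

Feeding it "Hi, " yields nothing, and feeding it "there." afterwards yields the whole sentence in one piece, mirroring the walkthrough above.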

View File

@@ -1,110 +0,0 @@
# Understanding Different Frame Types in the Pipecat System
In the Pipecat system, frames are used to represent different types of data and control signals that flow through the pipeline. Understanding these frame types is crucial for working with the system effectively. This tutorial will cover the main categories of frames and their specific uses.
## 1. Base Frame Classes
### Frame
The `Frame` class is the base class for all frames. It includes:
- `id`: A unique identifier
- `name`: A descriptive name
- `pts`: Presentation timestamp (optional)
### DataFrame
`DataFrame` is a subclass of `Frame` and serves as a base for most data-carrying frames.
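To make the base-class relationship concrete, here is a small dataclass sketch with the three fields listed above. It is an illustration only, not the definitions in `pipecat.frames`: the id counter, the naming scheme, and the example `TextFrame` subclass are assumptions.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical process-wide counter used to hand out unique ids.
_ids = itertools.count(1)


@dataclass
class Frame:
    """Base frame: a unique id, a descriptive name, an optional timestamp."""

    id: int = field(init=False)
    name: str = field(init=False)
    pts: Optional[int] = field(default=None, init=False)

    def __post_init__(self) -> None:
        self.id = next(_ids)
        # Assumed naming scheme: class name plus id.
        self.name = f"{type(self).__name__}#{self.id}"


@dataclass
class DataFrame(Frame):
    """Base for frames that carry data."""


@dataclass
class TextFrame(DataFrame):
    """Example data-carrying frame holding a chunk of text."""

    text: str
```

Because `id`, `name`, and `pts` are excluded from `__init__`, subclasses only declare their own payload fields, so `TextFrame("hello")` is all a caller needs to write.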
## 2. Audio Frames
### AudioRawFrame
Represents a chunk of audio with properties:
- `audio`: Raw audio data
- `sample_rate`: Audio sample rate
- `num_channels`: Number of audio channels
Subclasses include:
- `InputAudioRawFrame`: For audio from input sources
- `OutputAudioRawFrame`: For audio to be played by output devices
- `TTSAudioRawFrame`: For audio generated by Text-to-Speech services
## 3. Image Frames
### ImageRawFrame
Represents an image with properties:
- `image`: Raw image data
- `size`: Image dimensions
- `format`: Image format (e.g., JPEG, PNG)
Subclasses include:
- `InputImageRawFrame`: For images from input sources
- `OutputImageRawFrame`: For images to be displayed
- `UserImageRawFrame`: For images associated with a specific user
- `VisionImageRawFrame`: For images with associated text for description
- `URLImageRawFrame`: For images with an associated URL
### SpriteFrame
Represents an animated sprite, containing a list of `ImageRawFrame` objects.
## 4. Text and Transcription Frames
### TextFrame
Represents a chunk of text, used for various purposes in the pipeline.
### TranscriptionFrame
A specialized `TextFrame` for speech transcriptions, including:
- `user_id`: ID of the speaking user
- `timestamp`: When the transcription was generated
- `language`: Detected language of the speech
### InterimTranscriptionFrame
Similar to `TranscriptionFrame`, but for interim (not final) transcriptions.
## 5. LLM (Language Model) Frames
### LLMMessagesFrame
Contains a list of messages for an LLM service to process.
### LLMMessagesAppendFrame and LLMMessagesUpdateFrame
Used to modify the current context of LLM messages.
### LLMSetToolsFrame
Specifies tools (functions) available for the LLM to use.
### LLMEnablePromptCachingFrame
Controls prompt caching in certain LLMs.
## 6. System and Control Frames
### SystemFrame
Base class for system-level frames.
Important system frames include:
- `StartFrame`: Initiates a pipeline
- `CancelFrame`: Stops a pipeline immediately
- `ErrorFrame`: Notifies of errors (with `FatalErrorFrame` for unrecoverable errors)
- `EndTaskFrame` and `CancelTaskFrame`: Control pipeline tasks
- `StartInterruptionFrame` and `StopInterruptionFrame`: Indicate user speech for interruptions
### ControlFrame
Base class for control-flow frames.
Notable control frames:
- `EndFrame`: Signals the end of a pipeline
- `LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`: Bracket LLM responses
- `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`: Indicate user speech activity
- `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`: Indicate bot speech activity
- `TTSStartedFrame` and `TTSStoppedFrame`: Bracket Text-to-Speech responses
## 7. Special Purpose Frames
### MetricsFrame
Contains performance metrics data.
### FunctionCallInProgressFrame and FunctionCallResultFrame
Used for handling LLM function (tool) calls.
### ServiceUpdateSettingsFrame
Base class for updating service settings, with specific subclasses for LLM, TTS, and STT services.
## Conclusion
Understanding these frame types is essential for working with the Pipecat system. Each frame type serves a specific purpose in the pipeline, whether it's carrying data (like audio or images), controlling the flow of the pipeline, or managing system-level operations. By using the appropriate frame types, you can effectively process and transmit various kinds of information through your pipeline.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 98 KiB

View File

@@ -1,103 +0,0 @@
# Anthropic
ANTHROPIC_API_KEY=...
# AWS
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
AWS_REGION=...
# Azure
AZURE_SPEECH_REGION=...
AZURE_SPEECH_API_KEY=...
AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Cartesia
CARTESIA_API_KEY=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Neuphonic
NEUPHONIC_API_KEY=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# Gladia
GLADIA_API_KEY=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# PlayHT
PLAY_HT_USER_ID=...
PLAY_HT_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
TAVUS_PERSONA_ID=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Krisp
KRISP_MODEL_PATH=...
# DeepSeek
DEEPSEEK_API_KEY=...
# Groq
GROQ_API_KEY=...
# Grok
GROK_API_KEY=...
# Together.ai
TOGETHER_API_KEY=...
# Cerebras
CEREBRAS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Piper
PIPER_BASE_URL=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=
FAL_SMART_TURN_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=
TWILIO_AUTH_TOKEN=

194
env.example Normal file
View File

@@ -0,0 +1,194 @@
# AI-COUSTICS
AICOUSTICS_LICENSE_KEY=...
# Anthropic
ANTHROPIC_API_KEY=...
# Assembly AI
ASSEMBLYAI_API_KEY=...
# Async
ASYNCAI_API_KEY=...
ASYNCAI_VOICE_ID=...
# AWS
AWS_SECRET_ACCESS_KEY=...
AWS_ACCESS_KEY_ID=...
AWS_REGION=...
# Azure
AZURE_SPEECH_REGION=...
AZURE_SPEECH_API_KEY=...
AZURE_CHATGPT_API_KEY=...
AZURE_CHATGPT_ENDPOINT=https://...
AZURE_CHATGPT_MODEL=...
AZURE_REALTIME_API_KEY=...
AZURE_REALTIME_BASE_URL=...
AZURE_DALLE_API_KEY=...
AZURE_DALLE_ENDPOINT=https://...
AZURE_DALLE_MODEL=...
# Cartesia
CARTESIA_API_KEY=...
CARTESIA_VOICE_ID=...
# Cerebras
CEREBRAS_API_KEY=...
# Daily
DAILY_API_KEY=...
DAILY_SAMPLE_ROOM_URL=https://...
# Deepgram
DEEPGRAM_API_KEY=...
SAGEMAKER_ENDPOINT_NAME=...
# DeepSeek
DEEPSEEK_API_KEY=...
# ElevenLabs
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...
# Fal
FAL_KEY=...
# Fireworks
FIREWORKS_API_KEY=...
# Fish Audio
FISH_API_KEY=...
# Gladia
GLADIA_API_KEY=...
GLADIA_REGION=...
# Google
GOOGLE_API_KEY=...
GOOGLE_VERTEX_TEST_CREDENTIALS=...
GOOGLE_CLOUD_PROJECT_ID=...
GOOGLE_CLOUD_LOCATION=...
GOOGLE_TEST_CREDENTIALS=...
# Grok
GROK_API_KEY=...
# Groq
GROQ_API_KEY=...
# Heygen
HEYGEN_API_KEY=...
# Hume
HUME_API_KEY=...
HUME_VOICE_ID=...
# Inworld
INWORLD_API_KEY=...
# Krisp
KRISP_MODEL_PATH=...
# Krisp Viva
KRISP_VIVA_MODEL_PATH=...
# LiveKit
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
# LMNT
LMNT_API_KEY=...
LMNT_VOICE_ID=...
# MiniMax
MINIMAX_API_KEY=...
MINIMAX_GROUP_ID=...
# Mistral
MISTRAL_API_KEY=...
# Neuphonic
NEUPHONIC_API_KEY=...
# NVIDIA
NVIDIA_API_KEY=...
# OpenAI
OPENAI_API_KEY=...
# OpenPipe
OPENPIPE_API_KEY=...
# OpenRouter
OPENROUTER_API_KEY=...
# Perplexity
PERPLEXITY_API_KEY=...
# Picovoice Koala
KOALA_ACCESS_KEY=...
# Piper
PIPER_BASE_URL=...
# PlayHT
PLAYHT_USER_ID=...
PLAYHT_API_KEY=...
# Plivo
PLIVO_AUTH_ID=...
PLIVO_AUTH_TOKEN=...
# Qwen
QWEN_API_KEY=...
# Rime
RIME_API_KEY=...
RIME_VOICE_ID=...
# SambaNova
SAMBANOVA_API_KEY=...
# Sarvam AI
SARVAM_API_KEY=...
# Sentry
SENTRY_DSN=...
# Simli
SIMLI_API_KEY=...
SIMLI_FACE_ID=...
# Smart turn
LOCAL_SMART_TURN_MODEL_PATH=...
FAL_SMART_TURN_API_KEY=...
# Soniox
SONIOX_API_KEY=...
# Speechmatics
SPEECHMATICS_API_KEY=...
# Tavus
TAVUS_API_KEY=...
TAVUS_REPLICA_ID=...
# Telnyx
TELNYX_API_KEY=...
TELNYX_ACCOUNT_SID=...
# Together.ai
TOGETHER_API_KEY=...
# Twilio
TWILIO_ACCOUNT_SID=...
TWILIO_AUTH_TOKEN=...
# WhatsApp
WHATSAPP_TOKEN=...
WHATSAPP_WEBHOOK_VERIFICATION_TOKEN=...
WHATSAPP_PHONE_NUMBER_ID=...
WHATSAPP_APP_SECRET=...

View File

@@ -1,88 +1,31 @@
# Pipecat Examples
This directory contains examples to help you learn how to build with Pipecat.
# Pipecat &mdash; Examples
## Getting Started
## Foundational snippets
Small snippets that build on each other, introducing one or two concepts at a time.
New to Pipecat? Start here:
➡️ [Take a look](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational)
- **[Quickstart](quickstart/)** - Get your first voice AI bot running in 5 minutes _(coming soon)_
- **[Client/Server Web](client-server-web/)** - Learn to build web applications with Pipecat's client SDKs _(coming soon)_
- **[Phone Bot with Twilio](phone-bot-twilio/)** - Connect your bot to a phone number _(coming soon)_
## Chatbot examples
Collection of self-contained real-time voice and video AI demo applications built with Pipecat.
## Foundational Examples
### Quickstart
Single-file examples that introduce core Pipecat concepts one at a time. These examples:
Each project has its own set of dependencies and configuration variables. They intentionally avoid shared code across projects &mdash; you can grab whichever demo folder you want to work with as a starting point.
- Build on each other progressively
- Focus on specific features or integrations
- Are used for testing with every Pipecat release
We recommend you start with a virtual environment:
See the **[Foundational Examples README](foundational/)** for the complete list.
```shell
cd pipecat-ai/examples/simple-chatbot
## More Advanced Examples
python -m venv venv
Ready to explore complex use cases? Visit **[pipecat-examples](https://github.com/pipecat-ai/pipecat-examples)** for:
source venv/bin/activate
pip install -r requirements.txt
```
Next, follow the steps in the README for each demo.
Make sure you `pip install -r requirements.txt` for each demo project, so you can be sure to have the necessary service dependencies that extend the functionality of Pipecat. You can read more about the framework architecture [here](https://github.com/pipecat-ai/pipecat/tree/main/docs).
## Projects:
| Project | Description | Services |
|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------|
| [Simple Chatbot](simple-chatbot) | Basic voice-driven conversational bot. A good starting point for learning the flow of the framework. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Storytelling Chatbot](storytelling-chatbot) | Stitches together multiple third-party services to create a collaborative storytime experience. | Deepgram, ElevenLabs, OpenAI, Fal, Daily, Custom UI |
| [Translation Chatbot](translation-chatbot) | Listens for user speech, then translates that speech to Spanish and speaks the translation back. Demonstrates multi-participant use-cases. | Deepgram, Azure, OpenAI, Daily, Daily Prebuilt UI |
| [Moondream Chatbot](moondream-chatbot) | Demonstrates how to add vision capabilities to GPT4. **Note: works best with a GPU** | Deepgram, ElevenLabs, OpenAI, Moondream, Daily, Daily Prebuilt UI |
| [Patient intake](patient-intake) | A chatbot that can call functions in response to user input. | Deepgram, ElevenLabs, OpenAI, Daily, Daily Prebuilt UI |
| [Phone Chatbot](phone-chatbot) | A chatbot that connects to PSTN/SIP phone calls, powered by Daily or Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [Twilio Chatbot](twilio-chatbot) | A chatbot that connects to an incoming phone call from Twilio. | Deepgram, ElevenLabs, OpenAI, Daily, Twilio |
| [studypal](studypal) | A chatbot to have a conversation about any article on the web | |
| [WebSocket Chatbot Server](websocket-server) | A real-time websocket server that handles audio streaming and bot interactions with speech-to-text and text-to-speech capabilities. | Cartesia, Deepgram, OpenAI, Websockets |
> [!IMPORTANT]
> These example projects use Daily as a WebRTC transport and can be joined using their hosted Prebuilt UI.
> It provides a quick way to join a real-time session with your bot and test your ideas without building any frontend code. If you'd like to see an example of a custom UI, try Storybot.
## FAQ
### Deployment
For each of these demos we've included a `Dockerfile`. Out of the box, this should provide everything needed to get the respective demo running on a VM:
```shell
docker build -t username/app:tag .
docker run -p 7860:7860 --env-file ./.env username/app:tag
docker push ...
```
### SSL
If you're working with a custom UI (such as with the Storytelling Chatbot), it's important to ensure your deployment platform supports HTTPS, as accessing user devices such as mics and webcams requires SSL.
If you try to run a custom UI without SSL, you may see an error in the console telling you that `navigator` is undefined, or no devices are available.
### Are these examples production ready?
Yes, kind of.
These demos attempt to keep things simple and are unopinionated regarding environment or scalability.
We're using FastAPI to spawn a subprocess for the bots / agents &mdash; useful for small tests, but not so great for production grade apps with many concurrent users. You can see how this works in each project's `start` endpoint in `server.py`.
Creating virtualized worker pools and on-demand instances is out of scope for these examples, but we hope to add some examples to this repo soon!
For projects that have CUDA as a requirement, such as Moondream Chatbot, be sure to deploy to a GPU-powered platform (such as [fly.io](https://fly.io) or [Runpod](https://runpod.io).)
## Getting help
➡️ [Join our Discord](https://discord.gg/pipecat)
➡️ [Reach us on Twitter](https://x.com/pipecat_ai)
- Production-ready applications
- Multi-platform client implementations
- Telephony integrations
- Multimodal and creative applications
- Deployment and monitoring examples

View File

@@ -1,45 +0,0 @@
# Bot ready signaling
A simple Pipecat example demonstrating how to handle signaling between the client and the bot,
ensuring that the bot starts sending audio only when the client is available,
thereby avoiding the risk of cutting off the beginning of the audio.
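The handshake idea can be sketched with `asyncio` alone. This is a simplified model under stated assumptions, not the example's actual server code: `bot`, `client`, and `main` are hypothetical names, the "audio" is a list of strings, and an `asyncio.Event` stands in for the readiness message the client sends over the transport.

```python
import asyncio


async def bot(client_ready: asyncio.Event, sent: list) -> None:
    """Hypothetical bot: holds its audio until the client reports it can play."""
    await client_ready.wait()
    for chunk in ("Hello", "there"):
        sent.append(chunk)  # stands in for sending an audio chunk


async def client(client_ready: asyncio.Event) -> None:
    """Hypothetical client: signals readiness once playback has started."""
    await asyncio.sleep(0.01)  # simulate the audio element starting up
    client_ready.set()  # plays the role of the readiness message


async def main() -> list:
    ready = asyncio.Event()
    sent: list = []
    bot_task = asyncio.create_task(bot(ready, sent))
    await asyncio.sleep(0)  # let the bot run until it blocks on wait()
    nothing_sent_early = sent == []
    await client(ready)
    await bot_task
    return [nothing_sent_early, sent]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The point of the design is that the bot never guesses when the client is listening; it waits for an explicit signal, so the first audio chunk cannot be lost.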
## Quick Start
### First, start the bot server:
1. Navigate to the server directory:
```bash
cd server
```
2. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install requirements:
```bash
pip install -r requirements.txt
```
4. Copy env.example to .env and configure:
- Add your API keys
5. Start the server:
```bash
python server.py
```
### Next, connect using the client app:
For client-side setup, refer to the [JavaScript Guide](client/javascript/README.md).
## Important Note
Ensure the bot server is running before using any client implementations.
## Requirements
- Python 3.10+
- Node.js 16+ (for JavaScript)
- Daily API key
- Cartesia API key
- Modern web browser with WebRTC support

View File

@@ -1,27 +0,0 @@
# JavaScript Implementation
Basic implementation using the [Pipecat JavaScript SDK](https://docs.pipecat.ai/client/js/introduction).
## Setup
1. Run the bot server. See the [server README](../../README).
2. Navigate to the `client/javascript` directory:
```bash
cd client/javascript
```
3. Install dependencies:
```bash
npm install
```
4. Run the client app:
```bash
npm run dev
```
5. Visit http://localhost:5173 in your browser.

View File

@@ -1,34 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Chatbot</title>
</head>
<body>
<div class="container">
<div class="status-bar">
<div class="status">
Status: <span id="connection-status">Disconnected</span>
</div>
<div class="controls">
<button id="connect-btn">Connect</button>
<button id="disconnect-btn" disabled>Disconnect</button>
</div>
</div>
<audio id="bot-audio" autoplay></audio>
<div class="debug-panel">
<h3>Debug Info</h3>
<div id="debug-log"></div>
</div>
</div>
<script type="module" src="/src/app.js"></script>
<link rel="stylesheet" href="/src/style.css">
</body>
</html>

File diff suppressed because it is too large Load Diff

View File

@@ -1,20 +0,0 @@
{
"name": "client",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"dev": "vite",
"build": "vite build",
"preview": "vite preview"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"devDependencies": {
"vite": "^6.0.9"
},
"dependencies": {
"@daily-co/daily-js": "0.74.0"
}
}

View File

@@ -1,216 +0,0 @@
/**
* Copyright (c) 2024–2025, Daily
*
* SPDX-License-Identifier: BSD 2-Clause License
*/
import Daily from "@daily-co/daily-js";
/**
* ChatbotClient handles the connection and media management for a real-time
* voice interaction with an AI bot.
*/
class ChatbotClient {
constructor() {
// Initialize client state
this.dailyCallObject = null;
this.setupDOMElements();
this.setupEventListeners();
}
/**
* Set up references to DOM elements and create necessary media elements
*/
setupDOMElements() {
// Get references to UI control elements
this.connectBtn = document.getElementById('connect-btn');
this.disconnectBtn = document.getElementById('disconnect-btn');
this.statusSpan = document.getElementById('connection-status');
this.debugLog = document.getElementById('debug-log');
// Create an audio element for bot's voice output
this.botAudio = document.createElement('audio');
this.botAudio.autoplay = true;
this.botAudio.playsInline = true;
document.body.appendChild(this.botAudio);
}
/**
* Set up event listeners for connect/disconnect buttons
*/
setupEventListeners() {
this.connectBtn.addEventListener('click', () => this.connect());
this.disconnectBtn.addEventListener('click', () => this.disconnect());
}
/**
* Add a timestamped message to the debug log
*/
log(message) {
const entry = document.createElement('div');
entry.textContent = `${new Date().toISOString()} - ${message}`;
// Add styling based on message type
if (message.startsWith('User: ')) {
entry.style.color = '#2196F3'; // blue for user
} else if (message.startsWith('Bot: ')) {
entry.style.color = '#4CAF50'; // green for bot
}
this.debugLog.appendChild(entry);
this.debugLog.scrollTop = this.debugLog.scrollHeight;
console.log(message);
}
/**
* Update the connection status display
*/
updateStatus(status) {
this.statusSpan.textContent = status;
this.log(`Status: ${status}`);
}
handleEventToConsole (evt) {
this.log(`Received event: ${evt.action}`);
};
/**
* Set up listeners for track events (start/stop)
* This handles new tracks being added during the session
*/
setupTrackListeners() {
if (!this.dailyCallObject) return;
this.dailyCallObject.on("joined-meeting", () => {
this.updateStatus('Connected');
this.connectBtn.disabled = true;
this.disconnectBtn.disabled = false;
this.log('Client connected');
});
this.dailyCallObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
this.log("Audio track started.")
this.setupAudioTrack(evt.track);
}
});
this.dailyCallObject.on("track-stopped", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-joined", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-updated", this.handleEventToConsole.bind(this));
this.dailyCallObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
this.disconnect()
});
this.dailyCallObject.on("left-meeting", () => {
this.updateStatus('Disconnected');
this.connectBtn.disabled = false;
this.disconnectBtn.disabled = true;
this.log('Client disconnected');
});
this.dailyCallObject.on("error", this.handleEventToConsole.bind(this));
}
/**
* Set up an audio track for playback
* Handles both initial setup and track updates
*/
setupAudioTrack(track) {
this.log(`Setting up audio track, track state: ${track.readyState}, muted: ${track.muted}`);
// Check if we're already playing this track
if (this.botAudio.srcObject) {
const oldTrack = this.botAudio.srcObject.getAudioTracks()[0];
if (oldTrack?.id === track.id) return;
}
// Create a new MediaStream with the track and set it as the audio source
this.botAudio.srcObject = new MediaStream([track]);
this.botAudio.onplaying = async (event) => {
this.log("onplaying")
this.log("Will send the audio message to play the audio at the next tick")
this.dailyCallObject.sendAppMessage("playable")
}
}
async fetchRoomInfo() {
let connectUrl = '/connect'
let res = await fetch(connectUrl, {
method: "POST",
mode: "cors",
headers: new Headers({
"Content-Type": "application/json"
}),
})
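if (res.ok) {
return res.json();
}
// Fail loudly instead of returning undefined, so connect() reports a clear error
throw new Error(`Failed to fetch room info: ${res.status}`);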
}
/**
* Initialize and connect to the bot
* This creates the Daily call object, fetches the room info, and joins the room
*/
async connect() {
try {
// Initialize the client
this.dailyCallObject = Daily.createCallObject({
subscribeToTracksAutomatically: true,
});
// Set up listeners for media track events
this.setupTrackListeners();
this.log('Creating the bot...');
let roomInfo = await this.fetchRoomInfo()
// Connect to the bot
this.log('Connecting to bot...');
// Exposed on window only to make debugging easier
window.callObject = this.dailyCallObject;
await this.dailyCallObject.join({
url: roomInfo.room_url,
});
this.log('Connection complete');
} catch (error) {
// Handle any errors during connection
this.log(`Error connecting: ${error.message}`);
this.log(`Error stack: ${error.stack}`);
this.updateStatus('Error');
// Clean up if there's an error
if (this.dailyCallObject) {
try {
await this.dailyCallObject.leave();
} catch (disconnectError) {
this.log(`Error during disconnect: ${disconnectError.message}`);
}
}
}
}
/**
* Disconnect from the bot and clean up media resources
*/
async disconnect() {
if (this.dailyCallObject) {
try {
// Leave the call and destroy the Daily call object
await this.dailyCallObject.leave();
await this.dailyCallObject.destroy();
this.dailyCallObject = null;
// Clean up audio
if (this.botAudio.srcObject) {
this.botAudio.srcObject.getTracks().forEach((track) => track.stop());
this.botAudio.srcObject = null;
}
} catch (error) {
this.log(`Error disconnecting: ${error.message}`);
}
}
}
}
// Initialize the client when the page loads
window.addEventListener('DOMContentLoaded', () => {
new ChatbotClient();
});

View File

@@ -1,98 +0,0 @@
body {
margin: 0;
padding: 20px;
font-family: Arial, sans-serif;
background-color: #f0f0f0;
}
.container {
max-width: 1200px;
margin: 0 auto;
}
.status-bar {
display: flex;
justify-content: space-between;
align-items: center;
padding: 10px;
background-color: #fff;
border-radius: 8px;
margin-bottom: 20px;
}
.controls button {
padding: 8px 16px;
margin-left: 10px;
border: none;
border-radius: 4px;
cursor: pointer;
}
#connect-btn {
background-color: #4caf50;
color: white;
}
#disconnect-btn {
background-color: #f44336;
color: white;
}
button:disabled {
opacity: 0.5;
cursor: not-allowed;
}
.main-content {
background-color: #fff;
border-radius: 8px;
padding: 20px;
margin-bottom: 20px;
}
.bot-container {
display: flex;
flex-direction: column;
align-items: center;
}
#bot-video-container {
width: 640px;
height: 360px;
background-color: #e0e0e0;
border-radius: 8px;
margin: 20px auto;
overflow: hidden;
display: flex;
align-items: center;
justify-content: center;
}
#bot-video-container video {
width: 100%;
height: 100%;
object-fit: cover;
}
.debug-panel {
background-color: #fff;
border-radius: 8px;
padding: 20px;
}
.debug-panel h3 {
margin: 0 0 10px 0;
font-size: 16px;
font-weight: bold;
}
#debug-log {
height: 200px;
overflow-y: auto;
background-color: #f8f8f8;
padding: 10px;
border-radius: 4px;
font-family: monospace;
font-size: 12px;
line-height: 1.4;
}

View File

@@ -1,13 +0,0 @@
import { defineConfig } from 'vite';
export default defineConfig({
server: {
proxy: {
// Proxy /connect requests to the backend server
'/connect': {
target: 'http://0.0.0.0:7860', // Replace with your backend URL
changeOrigin: true,
},
},
},
});

View File

@@ -1,60 +0,0 @@
# React Native Implementation
Basic implementation using the [Pipecat React Native SDK](https://docs.pipecat.ai/client/react-native/introduction).
## Usage
### Expo requirements
This project cannot be used with an [Expo Go](https://docs.expo.dev/workflow/expo-go/) app because [it requires custom native code](https://docs.expo.io/workflow/customizing/).
When a project requires custom native code or a config plugin, we need to transition from using [Expo Go](https://docs.expo.dev/workflow/expo-go/)
to a [development build](https://docs.expo.dev/development/introduction/).
More details about the custom native code used by this demo can be found in [rn-daily-js-expo-config-plugin](https://github.com/daily-co/rn-daily-js-expo-config-plugin).
### Building remotely
If you do not have experience with Xcode and Android Studio builds or do not have them installed locally on your computer, you will need to follow [this guide from Expo to use EAS Build](https://docs.expo.dev/development/create-development-builds/#create-and-install-eas-build).
### Building locally
You will need the following installed locally on your computer:
- [Xcode](https://developer.apple.com/xcode/) to build for iOS
- [Android Studio](https://developer.android.com/studio) to build for Android
#### Install the demo dependencies
```bash
# Use the version of node specified in .nvmrc
nvm i
# Install dependencies
npm i
# Before a native app can be compiled, the native source code must be generated.
npx expo prebuild
# Configure the environment variable to connect to the local server
cp env.example .env
# edit .env and add your local ip address, for example: http://192.168.1.16:7860
```
#### Running on Android
After plugging in an Android device [configured for debugging](https://developer.android.com/studio/debug/dev-options), run the following command:
```
npm run android
```
#### Running on iOS
Run the following command:
```
npm run ios
```
#### Connect to the server
Use http://localhost:5173 in your app.

View File

@@ -1,75 +0,0 @@
{
"expo": {
"name": "bot-ready-rn",
"slug": "bot-ready-rn",
"version": "1.0.0",
"orientation": "portrait",
"icon": "./assets/icon.png",
"userInterfaceStyle": "light",
"splash": {
"image": "./assets/splash.png",
"resizeMode": "contain",
"backgroundColor": "#ffffff"
},
"updates": {
"fallbackToCacheTimeout": 0
},
"assetBundlePatterns": [
"**/*"
],
"ios": {
"supportsTablet": true,
"bitcode": false,
"bundleIdentifier": "co.daily.expo.BotReady",
"infoPlist": {
"UIBackgroundModes": [
"voip"
]
},
"appleTeamId": "EEBGKV9N3N"
},
"android": {
"adaptiveIcon": {
"foregroundImage": "./assets/adaptive-icon.png",
"backgroundColor": "#FFFFFF"
},
"package": "co.daily.expo.BotReady",
"permissions": [
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.BLUETOOTH",
"android.permission.CAMERA",
"android.permission.INTERNET",
"android.permission.MODIFY_AUDIO_SETTINGS",
"android.permission.RECORD_AUDIO",
"android.permission.SYSTEM_ALERT_WINDOW",
"android.permission.WAKE_LOCK",
"android.permission.FOREGROUND_SERVICE",
"android.permission.FOREGROUND_SERVICE_CAMERA",
"android.permission.FOREGROUND_SERVICE_MICROPHONE",
"android.permission.FOREGROUND_SERVICE_MEDIA_PROJECTION",
"android.permission.POST_NOTIFICATIONS"
]
},
"web": {
"favicon": "./assets/favicon.png"
},
"plugins": [
"@config-plugins/react-native-webrtc",
"@daily-co/config-plugin-rn-daily-js",
[
"expo-build-properties",
{
"android": {
"minSdkVersion": 24,
"compileSdkVersion": 35,
"targetSdkVersion": 34,
"buildToolsVersion": "35.0.0"
},
"ios": {
"deploymentTarget": "15.1"
}
}
]
]
}
}

View File

@@ -1,7 +0,0 @@
module.exports = function(api) {
api.cache(true);
return {
presets: ['babel-preset-expo'],
plugins: [["module:react-native-dotenv"]],
};
};

View File

@@ -1 +0,0 @@
API_BASE_URL=http://YOUR_LOCAL_IP:7860

View File

@@ -1,7 +0,0 @@
import { registerRootComponent } from "expo";
import App from "./src/App";
// registerRootComponent calls AppRegistry.registerComponent('main', () => App);
// It also ensures that the environment is set up appropriately
registerRootComponent(App);

View File

@@ -1,4 +0,0 @@
// Learn more https://docs.expo.io/guides/customizing-metro
const { getDefaultConfig } = require('expo/metro-config');
module.exports = getDefaultConfig(__dirname);

View File

@@ -1,31 +0,0 @@
{
"name": "bot-ready-rn",
"version": "1.0.0",
"scripts": {
"start": "expo start --dev-client",
"android": "expo run:android --device",
"ios": "expo run:ios --device",
"web": "expo start --web"
},
"dependencies": {
"@config-plugins/react-native-webrtc": "^10.0.0",
"@daily-co/config-plugin-rn-daily-js": "0.0.7",
"@daily-co/react-native-daily-js": "^0.70.0",
"@daily-co/react-native-webrtc": "^118.0.3-daily.2",
"@react-native-async-storage/async-storage": "1.23.1",
"expo": "^52.0.0",
"expo-build-properties": "~0.13.1",
"expo-dev-client": "~5.0.5",
"expo-splash-screen": "~0.29.16",
"expo-status-bar": "~2.0.0",
"react": "18.3.1",
"react-native": "0.76.3",
"react-native-background-timer": "^2.4.1",
"react-native-dotenv": "^3.4.11",
"react-native-get-random-values": "^1.11.0"
},
"devDependencies": {
"@babel/core": "^7.12.9"
},
"private": true
}

View File

@@ -1,121 +0,0 @@
import React, { useState, useEffect } from 'react';
import {SafeAreaView, View, Text, Button, StyleSheet, ScrollView} from 'react-native';
import Daily from "@daily-co/react-native-daily-js";
import { API_BASE_URL } from "@env";
const CallScreen = () => {
const [connectionStatus, setConnectionStatus] = useState('Disconnected');
const [isConnected, setIsConnected] = useState(false);
const [callObject, setCallObject] = useState(null);
const [logs, setLogs] = useState([]);
useEffect(() => {
if (callObject) {
setupTrackListeners(callObject);
}
}, [callObject]);
const log = (message) => {
setLogs((prevLogs) => [...prevLogs, `${new Date().toISOString()} - ${message}`]);
console.log(message);
};
const setupTrackListeners = (callObject) => {
callObject.on("joined-meeting", () => {
setConnectionStatus('Connected');
setIsConnected(true);
log('Client connected');
});
callObject.on("left-meeting", () => {
setConnectionStatus('Disconnected');
setIsConnected(false);
log('Client disconnected');
});
callObject.on("participant-left", () => {
// When the bot leaves, we are also disconnecting from the call
disconnect().catch((err) => {
log(`Failed to disconnect ${err}`);
})
});
// Trigger so the bot can start sending audio
callObject.on("track-started", (evt) => {
if (evt.track.kind === "audio" && evt.participant.local === false) {
handleEventToConsole(evt)
log("Sending the message that will trigger the bot to play the audio.")
callObject.sendAppMessage("playable")
}
});
callObject.on("error", (evt) => log(`Error: ${evt.error}`));
// Other events just for awareness
callObject.on("track-stopped", handleEventToConsole);
callObject.on("participant-joined", handleEventToConsole);
callObject.on("participant-updated", handleEventToConsole);
};
const handleEventToConsole = (evt) => {
log(`Received event: ${evt.action}`);
};
const connect = async () => {
try {
const callObject = Daily.createCallObject({ subscribeToTracksAutomatically: true });
setCallObject(callObject);
const connectionUrl = `${API_BASE_URL}/connect`
const res = await fetch(connectionUrl, { method: "POST", headers: { "Content-Type": "application/json" } });
const roomInfo = await res.json();
await callObject.join({ url: roomInfo.room_url });
} catch (error) {
log(`Error connecting: ${error.message}`);
}
};
const disconnect = async () => {
if (callObject) {
try {
await callObject.leave();
await callObject.destroy();
setCallObject(null);
} catch (error) {
log(`Error disconnecting: ${error.message}`);
}
}
};
return (
<SafeAreaView style={styles.safeArea}>
<View style={styles.container}>
<View style={styles.statusBar}>
<Text>Status: <Text style={styles.status}>{connectionStatus}</Text></Text>
<View style={styles.controls}>
<Button
title={isConnected ? "Disconnect" : "Connect"}
onPress={isConnected ? disconnect : connect}
/>
</View>
</View>
<View style={styles.debugPanel}>
<Text style={styles.debugTitle}>Debug Info</Text>
<ScrollView style={styles.debugLog}>
{logs.map((logEntry, index) => (
<Text key={index} style={styles.logText}>{logEntry}</Text>
))}
</ScrollView>
</View>
</View>
</SafeAreaView>
);
};
const styles = StyleSheet.create({
safeArea: { flex: 1, backgroundColor: '#f0f0f0', padding: 20 },
container: { flex: 1, margin: 20 },
statusBar: { flexDirection: 'row', justifyContent: 'space-between', alignItems: 'center', padding: 10, backgroundColor: '#fff', borderRadius: 8, marginBottom: 20 },
status: { fontWeight: 'bold' },
controls: { flexDirection: 'row', gap: 10 },
debugPanel: { height: '80%', backgroundColor: '#fff', borderRadius: 8, padding: 20},
debugTitle: { fontSize: 16, fontWeight: 'bold' },
debugLog: { height: '100%', overflow: 'scroll', backgroundColor: '#f8f8f8', padding: 10, borderRadius: 4, fontFamily: 'monospace', fontSize: 12, lineHeight: 1.4 },
});
export default CallScreen;

View File

@@ -1,50 +0,0 @@
# Bot ready signaling Server
A FastAPI server that manages bot instances and provides an endpoint for Pipecat client connections.
## Endpoints
- `POST /connect` - Pipecat client connection endpoint
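As a rough illustration, a client could prepare a call to this endpoint as sketched below. The response shape (`room_url` and `token`) follows what `server.py` returns. The helper names `build_connect_request` and `parse_connect_response`, and the default base URL, are assumptions made for this sketch and are not part of the example.

```python
import json
import urllib.request


def build_connect_request(base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build (but do not send) the POST request for the /connect endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/connect",
        data=b"{}",  # empty JSON body; the endpoint in this example does not read it
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_connect_response(body: bytes) -> tuple[str, str]:
    """Extract (room_url, token) from the JSON body returned by /connect."""
    payload = json.loads(body)
    return payload["room_url"], payload["token"]
```

With the server running, passing the request to `urllib.request.urlopen` and feeding the response bytes to `parse_connect_response` should yield the room URL that the client then joins.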
## Environment Variables
Copy `env.example` to `.env` and configure:
```ini
# Required API Keys
DAILY_API_KEY= # Your Daily API key
CARTESIA_API_KEY= # Your Cartesia API key
# Optional Configuration
DAILY_API_URL= # Optional: Daily API URL (defaults to https://api.daily.co/v1)
DAILY_SAMPLE_ROOM_URL= # Optional: Fixed room URL for development
HOST= # Optional: Host address (defaults to 0.0.0.0)
FAST_API_PORT= # Optional: Port number (defaults to 7860)
```
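The defaults listed above can be resolved in one place, as in the following minimal sketch. `ServerConfig` and `load_server_config` are hypothetical names introduced here for illustration. Unlike the plain `os.getenv(name, default)` calls in `server.py`, this sketch also treats an empty value as unset, which is an assumption about the desired behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    daily_api_key: str
    daily_api_url: str
    host: str
    port: int


def load_server_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Read the server settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ServerConfig(
        daily_api_key=env.get("DAILY_API_KEY", ""),
        # `or` makes an empty string fall back to the default as well
        daily_api_url=env.get("DAILY_API_URL") or "https://api.daily.co/v1",
        host=env.get("HOST") or "0.0.0.0",
        port=int(env.get("FAST_API_PORT") or "7860"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise without touching the real environment.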
## Running the Server
Set up and activate your virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
Install dependencies:
```bash
pip install -r requirements.txt
```
If you want to use the local version of `pipecat` in this repo rather than the latest published version, also run:
```bash
pip install --editable "../../../[daily,cartesia,openai]"
```
Run the server:
```bash
python server.py
```

View File

@@ -1,3 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=
CARTESIA_API_KEY=

View File

@@ -1,4 +0,0 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,cartesia,openai]

View File

@@ -1,64 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
from typing import Optional
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
(url, token, _) = await configure_with_args(aiohttp_session)
return (url, token)
async def configure_with_args(
aiohttp_session: aiohttp.ClientSession, parser: Optional[argparse.ArgumentParser] = None
):
if not parser:
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token, args)

View File

@@ -1,147 +0,0 @@
#
# Copyright (c) 2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
from typing import Any, Dict
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
# Load environment variables from .env file
load_dotenv(override=True)
# Dictionary to track bot processes: {pid: (process, room_url)}
bot_procs = {}
# Store Daily API helpers
daily_helpers = {}
def cleanup():
"""Cleanup function to terminate all bot processes.
Called during server shutdown.
"""
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
"""FastAPI lifespan manager that handles startup and shutdown tasks.
- Creates aiohttp session
- Initializes Daily API helper
- Cleans up resources on shutdown
"""
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
# Initialize FastAPI app with lifespan manager
app = FastAPI(lifespan=lifespan)
# Configure CORS to allow requests from any origin
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
async def create_room_and_token() -> tuple[str, str]:
"""Helper function to create a Daily room and generate an access token.
Returns:
tuple[str, str]: A tuple containing (room_url, token)
Raises:
HTTPException: If room creation or token generation fails
"""
room = await daily_helpers["rest"].create_room(DailyRoomParams())
if not room.url:
raise HTTPException(status_code=500, detail="Failed to create room")
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
return room.url, token
@app.post("/connect")
async def bot_connect(request: Request) -> Dict[Any, Any]:
"""Connect endpoint that creates a room and returns connection credentials.
This endpoint is called by client to establish a connection.
Returns:
Dict[Any, Any]: Authentication bundle containing room_url and token
Raises:
HTTPException: If room creation, token generation, or bot startup fails
"""
print("Creating room for RTVI connection")
room_url, token = await create_room_and_token()
print(f"Room URL: {room_url}")
# Start the bot process
try:
bot_file = "signalling_bot"
proc = subprocess.Popen(
# Pass arguments as a list so the room URL and token are never interpreted by a shell
["python3", "-m", bot_file, "-u", room_url, "-t", token],
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room_url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
# Return the authentication bundle in format expected by DailyTransport
return {"room_url": room_url, "token": token}
if __name__ == "__main__":
import uvicorn
# Parse command line arguments for server configuration
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Travel Companion FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
# Start the FastAPI server
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)

View File

@@ -1,95 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
from dataclasses import dataclass
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.frames.frames import AudioRawFrame, EndFrame, OutputAudioRawFrame, TTSSpeakFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.cartesia.tts import CartesiaTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
@dataclass
class SilenceFrame(OutputAudioRawFrame):
def __init__(
self,
*,
sample_rate: int,
duration: float,
):
# Initialize the parent class with the silent frame's data
super().__init__(
audio=self.create_silent_audio_frame(sample_rate, 1, duration).audio,
sample_rate=sample_rate,
num_channels=1,
)
@staticmethod
def create_silent_audio_frame(
sample_rate: int, num_channels: int, duration: float
) -> AudioRawFrame:
"""Create an AudioRawFrame containing silence."""
frame_size = num_channels * 2 # 2 bytes per sample for 16-bit audio
total_frames = int(sample_rate * duration)
total_bytes = total_frames * frame_size
silent_audio = bytes(total_bytes) # Create a byte array filled with zeros
return AudioRawFrame(audio=silent_audio, sample_rate=sample_rate, num_channels=num_channels)
async def main():
async with aiohttp.ClientSession() as session:
(room_url, _) = await configure(session)
transport = DailyTransport(
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True)
)
tts = CartesiaTTSService(
api_key=os.getenv("CARTESIA_API_KEY"),
voice_id="71a7ad14-091c-4e8e-a314-022ece01c121", # British Reading Lady
)
runner = PipelineRunner()
task = PipelineTask(Pipeline([tts, transport.output()]))
# Register an event handler so we can play the audio when we receive a specific message
@transport.event_handler("on_app_message")
async def on_app_message(transport, message, sender):
logger.debug(f"Received app message: {message} - {sender}")
if "playable" not in message:
return
await task.queue_frames(
[
SilenceFrame(
sample_rate=task.params.audio_out_sample_rate,
duration=0.5,
),
TTSSpeakFrame("Hello there, how are you doing today?"),
EndFrame(),
]
)
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,161 +0,0 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
recordings/
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
runpod.toml

View File

@@ -1,10 +0,0 @@
FROM python:3.10-bullseye
RUN mkdir /app
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]

View File

@@ -1,66 +0,0 @@
# Chatbot with canonical-metrics
This project implements a chatbot using a pipeline architecture that integrates audio processing, transcription, and a language model for conversational interactions. The chatbot runs inside a Daily call and uses separate services for text-to-speech and language model responses.
## Features
- **Audio Input and Output**: Captures microphone input and plays back audio responses.
- **Voice Activity Detection**: Utilizes Silero VAD to manage audio input intelligently.
- **Text-to-Speech**: Integrates ElevenLabs TTS service to convert text responses into audio.
- **Language Model Interaction**: Uses OpenAI's GPT-4 model to generate responses based on user input.
- **Transcription Services**: Captures and transcribes participant speech for analytics.
- **Metrics Collection**: Sends audio data for analysis via Canonical Metrics Service.
## Requirements
- Python 3.10+
- `python-dotenv`
- Additional libraries from the `pipecat` package.
## Setup
1. Clone the repository.
2. Install the required packages.
3. Set up environment variables for API keys:
- `OPENAI_API_KEY`
- `ELEVENLABS_API_KEY`
- `CANONICAL_API_KEY`
- `CANONICAL_API_URL`
4. Run the script.
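Before running the script, it can help to confirm that the variables from step 3 are actually set. The following is a minimal sketch. `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, and treating all four variables as mandatory is an assumption based only on the list above.

```python
import os
from typing import Mapping, Optional

# Variables named in the setup steps; whether each one is strictly required is an assumption.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
    "CANONICAL_API_KEY",
    "CANONICAL_API_URL",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

An empty result means every listed variable has a non-blank value. Otherwise the returned names indicate what still needs to be added to `.env`.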
## Usage
The chatbot introduces itself and engages in conversations, providing brief and creative responses. Designed for flexibility, it can support multiple languages with appropriate configuration.
## Events
- Participants joining or leaving the call are handled dynamically, adjusting the chatbot's behavior accordingly.
The first run might take extra time to start because the VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
## Build and test the Docker image
```
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```

View File

@@ -1,146 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import os
import sys
import uuid
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.canonical.metrics import CanonicalMetricsService
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
audio_out_enabled=True,
audio_in_enabled=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
#
# Spanish
#
# transcription_settings=DailyTranscriptionSettings(
# language="es",
# tier="nova",
# model="2-general"
# )
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
#
# English
#
voice_id="cgSgspJ2msm6clMCkdW9",
#
# Spanish
#
# model="eleven_multilingual_v2",
# voice_id="gD1IexrzCvsXPHUuT0s3",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
#
# English
#
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your responses to 12 words or fewer.",
#
# Spanish
#
# "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
"""
CanonicalMetrics uses AudioBufferProcessor under the hood to buffer the audio. On
call completion, CanonicalMetrics will send the audio buffer to Canonical for
analysis. Visit https://voice.canonical.chat to learn more.
"""
audio_buffer_processor = AudioBufferProcessor(num_channels=2)
canonical = CanonicalMetricsService(
audio_buffer_processor=audio_buffer_processor,
aiohttp_session=session,
api_key=os.getenv("CANONICAL_API_KEY"),
call_id=str(uuid.uuid4()),
assistant="pipecat-chatbot",
assistant_speaks_first=True,
context=context,
)
pipeline = Pipeline(
[
transport.input(), # microphone
context_aggregator.user(),
llm,
tts,
transport.output(),
canonical, # uploads audio buffer to Canonical AI for metrics
audio_buffer_processor, # captures audio into a buffer
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await audio_buffer_processor.start_recording()
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.cancel()
@transport.event_handler("on_call_state_updated")
async def on_call_state_updated(transport, state):
if state == "left":
# Here we don't want to cancel, we just want to finish sending
# whatever is queued, so we use an EndFrame().
await task.queue_frame(EndFrame())
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())

View File

@@ -1,6 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...
CANONICAL_API_KEY=can...
CANONICAL_API_URL=

View File

@@ -1,5 +0,0 @@
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs,canonical]

View File

@@ -1,55 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)
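
`configure` prefers a command-line value and falls back to an environment variable. A minimal sketch of that lookup order, isolated so it can be exercised without a Daily account; `resolve_room_url` is a hypothetical helper name and the `env` parameter exists only to make the function testable:

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def resolve_room_url(
    argv: Sequence[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the room URL from -u/--url, else from DAILY_SAMPLE_ROOM_URL."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-u", "--url", type=str, required=False)
    # parse_known_args tolerates flags that this parser does not define,
    # so other components can still read their own options from argv.
    args, _unknown = parser.parse_known_args(list(argv))
    return args.url or env.get("DAILY_SAMPLE_ROOM_URL")
```

Returning `None` instead of raising leaves the choice of error message to the caller, which is one possible design; the original raises immediately.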

@@ -1,139 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
MAX_BOTS_PER_ROOM = 1
# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}
daily_helpers = {}
load_dotenv(override=True)
def cleanup():
# Clean up function, just to be extra safe
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/")
async def start_agent(request: Request):
print(f"!!! Creating room")
room = await daily_helpers["rest"].create_room(DailyRoomParams())
print(f"!!! Room URL: {room.url}")
# Ensure the room property is present
if not room.url:
raise HTTPException(
status_code=500,
detail="Missing 'room' property in request data. Cannot start agent without a target room!",
)
# Check if there is already an existing process running in this room
num_bots_in_room = sum(
1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
)
if num_bots_in_room >= MAX_BOTS_PER_ROOM:
raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")
# Get the token for the room
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
# Spawn a new agent, and join the user session
# Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
try:
proc = subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room.url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
return RedirectResponse(room.url)
@app.get("/status/{pid}")
def get_status(pid: int):
# Look up the subprocess
proc = bot_procs.get(pid)
# If the subprocess doesn't exist, return an error
if not proc:
raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")
# Check the status of the subprocess
if proc[0].poll() is None:
status = "running"
else:
status = "finished"
return JSONResponse({"bot_id": pid, "status": status})
if __name__ == "__main__":
import uvicorn
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)
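
`start_agent` launches the bot by interpolating the room URL and token into a single shell string. One possible alternative, shown here as a sketch rather than as the project's method, is to pass an argument list so that no shell parsing is involved. `build_bot_command` is a hypothetical helper, and the `-u`/`-t` flags are copied from the command string above.

```python
import subprocess
import sys
from typing import List


def build_bot_command(room_url: str, token: str) -> List[str]:
    """Build an argv list; each value stays a single argument."""
    return [sys.executable, "-m", "bot", "-u", room_url, "-t", token]


def spawn_bot(room_url: str, token: str, cwd: str) -> subprocess.Popen:
    """Start the bot module without going through a shell."""
    return subprocess.Popen(build_bot_command(room_url, token), cwd=cwd)
```

With a list and the default `shell=False`, characters such as `&` or `;` in a URL are passed through literally instead of being interpreted by the shell.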

@@ -1,161 +0,0 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
runpod.toml

@@ -1,15 +0,0 @@
FROM python:3.10-bullseye
RUN mkdir /app
RUN mkdir /app/assets
RUN mkdir /app/utils
COPY *.py /app/
COPY requirements.txt /app/
WORKDIR /app
RUN pip3 install -r requirements.txt
EXPOSE 7860
CMD ["python3", "server.py"]

@@ -1,37 +0,0 @@
# Simple Chatbot
<img src="image.png" width="420px">
This app connects you to a chatbot powered by GPT-4, complete with animations generated by Stable Video Diffusion.
See a video of it in action: https://x.com/kwindla/status/1778628911817183509
And a quick video walkthrough of the code: https://www.loom.com/share/13df1967161f4d24ade054e7f8753416
The first run may take longer to start because the VAD (Voice Activity Detection) model needs to be downloaded.
## Get started
```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp env.example .env # and add your credentials
```
## Run the server
```bash
python server.py
```
Then, visit `http://localhost:7860/` in your browser to start a chatbot session.
## Build and test the Docker image
```
docker build -t chatbot .
docker run --env-file .env -p 7860:7860 chatbot
```

@@ -1,162 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import asyncio
import datetime
import io
import os
import sys
import wave
import aiofiles
import aiohttp
from dotenv import load_dotenv
from loguru import logger
from runner import configure
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.processors.audio.audio_buffer_processor import AudioBufferProcessor
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
# Create the recordings directory if it doesn't exist
os.makedirs("recordings", exist_ok=True)
async def save_audio(audio: bytes, sample_rate: int, num_channels: int, name: str):
if len(audio) > 0:
filename = os.path.join(
"recordings",
f"{name}_conversation_recording{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}.wav",
)
with io.BytesIO() as buffer:
with wave.open(buffer, "wb") as wf:
wf.setsampwidth(2)
wf.setnchannels(num_channels)
wf.setframerate(sample_rate)
wf.writeframes(audio)
async with aiofiles.open(filename, "wb") as file:
await file.write(buffer.getvalue())
print(f"Merged audio saved to {filename}")
else:
print("No audio data to save")
async def main():
async with aiohttp.ClientSession() as session:
(room_url, token) = await configure(session)
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
audio_out_enabled=True,
audio_in_enabled=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
#
# Spanish
#
# transcription_settings=DailyTranscriptionSettings(
# language="es",
# tier="nova",
# model="2-general"
# )
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY"),
#
# English
#
voice_id="cgSgspJ2msm6clMCkdW9",
#
# Spanish
#
# model="eleven_multilingual_v2",
# voice_id="gD1IexrzCvsXPHUuT0s3",
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
#
# English
#
"content": "You are Chatbot, a friendly, helpful robot. Your goal is to demonstrate your capabilities in a succinct way. Your output will be converted to audio so don't include special characters in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by introducing yourself. Keep all your response to 12 words or fewer.",
#
# Spanish
#
# "content": "Eres Chatbot, un amigable y útil robot. Tu objetivo es demostrar tus capacidades de una manera breve. Tus respuestas se convertiran a audio así que nunca no debes incluir caracteres especiales. Contesta a lo que el usuario pregunte de una manera creativa, útil y breve. Empieza por presentarte a ti mismo.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
# NOTE: Watch out! This will save all the conversation in memory. You
# can pass `buffer_size` to get periodic callbacks.
audiobuffer = AudioBufferProcessor(enable_turn_audio=True)
pipeline = Pipeline(
[
transport.input(), # microphone
context_aggregator.user(),
llm,
tts,
transport.output(),
audiobuffer, # used to buffer the audio in the pipeline
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@audiobuffer.event_handler("on_audio_data")
async def on_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "full")
@audiobuffer.event_handler("on_user_turn_audio_data")
async def on_user_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "user")
@audiobuffer.event_handler("on_bot_turn_audio_data")
async def on_bot_turn_audio_data(buffer, audio, sample_rate, num_channels):
await save_audio(audio, sample_rate, num_channels, "bot")
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await audiobuffer.start_recording()
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
print(f"Participant left: {participant}")
await task.cancel()
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
asyncio.run(main())
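
`save_audio` wraps raw 16-bit PCM in a WAV container before writing it to disk. The sketch below isolates that step using only the standard library and adds a small reader to check the result; `pcm_to_wav_bytes` and `wav_info` are hypothetical helper names, not part of the example.

```python
import io
import wave
from typing import Tuple


def pcm_to_wav_bytes(audio: bytes, sample_rate: int, num_channels: int) -> bytes:
    """Wrap raw 16-bit PCM samples in an in-memory WAV file."""
    with io.BytesIO() as buffer:
        with wave.open(buffer, "wb") as wf:
            wf.setsampwidth(2)  # 2 bytes per sample, i.e. 16-bit audio
            wf.setnchannels(num_channels)
            wf.setframerate(sample_rate)
            wf.writeframes(audio)
        # The header is finalized when the wave writer closes.
        return buffer.getvalue()


def wav_info(data: bytes) -> Tuple[int, int, int]:
    """Return (channels, sample_rate, frame_count) for WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        return wf.getnchannels(), wf.getframerate(), wf.getnframes()
```

One second of mono silence at 16 kHz is 16000 frames of two bytes each, which makes a convenient sanity check for the header fields.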

@@ -1,4 +0,0 @@
DAILY_SAMPLE_ROOM_URL=https://yourdomain.daily.co/yourroom # (for joining the bot to the same room repeatedly for local dev)
DAILY_API_KEY=7df...
OPENAI_API_KEY=sk-PL...
ELEVENLABS_API_KEY=aeb...

@@ -1,5 +0,0 @@
aiofiles
python-dotenv
fastapi[all]
uvicorn
pipecat-ai[daily,openai,silero,elevenlabs]

@@ -1,56 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import aiohttp
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper
async def configure(aiohttp_session: aiohttp.ClientSession):
parser = argparse.ArgumentParser(description="Daily AI SDK Bot Sample")
parser.add_argument(
"-u", "--url", type=str, required=False, help="URL of the Daily room to join"
)
parser.add_argument(
"-k",
"--apikey",
type=str,
required=False,
help="Daily API Key (needed to create an owner token for the room)",
)
args, unknown = parser.parse_known_args()
url = args.url or os.getenv("DAILY_SAMPLE_ROOM_URL")
key = args.apikey or os.getenv("DAILY_API_KEY")
if not url:
raise Exception(
"No Daily room specified. use the -u/--url option from the command line, or set DAILY_SAMPLE_ROOM_URL in your environment to specify a Daily room URL."
)
if not key:
raise Exception(
"No Daily API key specified. use the -k/--apikey option from the command line, or set DAILY_API_KEY in your environment to specify a Daily API key, available from https://dashboard.daily.co/developers."
)
daily_rest_helper = DailyRESTHelper(
daily_api_key=key,
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
# Create a meeting token for the given room with an expiration 1 hour in
# the future.
expiry_time: float = 60 * 60
token = await daily_rest_helper.get_token(url, expiry_time)
return (url, token)

@@ -1,139 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse, RedirectResponse
from pipecat.transports.services.helpers.daily_rest import DailyRESTHelper, DailyRoomParams
MAX_BOTS_PER_ROOM = 1
# Bot sub-process dict for status reporting and concurrency control
bot_procs = {}
daily_helpers = {}
load_dotenv(override=True)
def cleanup():
# Clean up function, just to be extra safe
for entry in bot_procs.values():
proc = entry[0]
proc.terminate()
proc.wait()
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
cleanup()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
@app.get("/")
async def start_agent(request: Request):
print(f"!!! Creating room")
room = await daily_helpers["rest"].create_room(DailyRoomParams())
print(f"!!! Room URL: {room.url}")
# Ensure the room property is present
if not room.url:
raise HTTPException(
status_code=500,
detail="Missing 'room' property in request data. Cannot start agent without a target room!",
)
# Check if there is already an existing process running in this room
num_bots_in_room = sum(
1 for proc in bot_procs.values() if proc[1] == room.url and proc[0].poll() is None
)
if num_bots_in_room >= MAX_BOTS_PER_ROOM:
raise HTTPException(status_code=500, detail=f"Max bot limit reached for room: {room.url}")
# Get the token for the room
token = await daily_helpers["rest"].get_token(room.url)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
# Spawn a new agent, and join the user session
# Note: this is mostly for demonstration purposes (refer to 'deployment' in README)
try:
proc = subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
bot_procs[proc.pid] = (proc, room.url)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
return RedirectResponse(room.url)
@app.get("/status/{pid}")
def get_status(pid: int):
# Look up the subprocess
proc = bot_procs.get(pid)
# If the subprocess doesn't exist, return an error
if not proc:
raise HTTPException(status_code=404, detail=f"Bot with process id: {pid} not found")
# Check the status of the subprocess
if proc[0].poll() is None:
status = "running"
else:
status = "finished"
return JSONResponse({"bot_id": pid, "status": status})
if __name__ == "__main__":
import uvicorn
default_host = os.getenv("HOST", "0.0.0.0")
default_port = int(os.getenv("FAST_API_PORT", "7860"))
parser = argparse.ArgumentParser(description="Daily Storyteller FastAPI server")
parser.add_argument("--host", type=str, default=default_host, help="Host address")
parser.add_argument("--port", type=int, default=default_port, help="Port number")
parser.add_argument("--reload", action="store_true", help="Reload code on change")
config = parser.parse_args()
uvicorn.run(
"server:app",
host=config.host,
port=config.port,
reload=config.reload,
)

@@ -1,13 +0,0 @@
FROM python:3.11-bullseye
# Open port 7860 for http service
ENV FAST_API_PORT=7860
EXPOSE 7860
# Install Python dependencies
COPY *.py .
COPY ./requirements.txt requirements.txt
RUN pip3 install --no-cache-dir --upgrade -r requirements.txt
# Start the FastAPI server
CMD python3 bot_runner.py --port ${FAST_API_PORT}

@@ -1,39 +0,0 @@
# Fly.io deployment example
This project modifies the `bot_runner.py` server to launch a new machine for each user session. This is the recommended approach for production: running each bot as a shell process on a single machine will quickly exhaust system resources under load.
For this example, we are using Daily as a WebRTC transport and provisioning a new room and token for each session. You can use another transport, such as WebSockets, by modifying the `bot.py` and `bot_runner.py` files accordingly.
## Setting up your fly.io deployment
### Create your fly.toml file
You can copy the `example-fly.toml` as a reference. Be sure to change the app name to something unique.
### Create your .env file
Copy the base `env.example` to `.env` and enter the necessary API keys.
`FLY_APP_NAME` should match the app name in your `fly.toml` file.
### Launch a new fly.io project
`fly launch` or `fly launch --org your-org-name`
### Set the necessary app secrets from your .env
Note: you can do this manually via the fly.io dashboard under the "secrets" sub-section of your deployment (e.g. "https://fly.io/apps/fly-app-name/secrets") or run the following terminal command:
`cat .env | tr '\n' ' ' | xargs flyctl secrets set`
### Deploy your machine
`fly deploy`
## Connecting to your bot
Send a POST request to your running fly.io instance:
`curl --location --request POST 'https://YOUR_FLY_APP_NAME/'`
This request will wait until the machine enters the `started` state before returning a room URL and token to join.
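
The same request can be made from Python. This is a minimal sketch using only the standard library; `build_start_request`, `parse_start_response`, and `start_session` are hypothetical helper names, and the `room_url` and `token` keys are taken from the JSON that `bot_runner.py` returns.

```python
import json
import urllib.request
from typing import Tuple


def build_start_request(base_url: str) -> urllib.request.Request:
    """Prepare the POST that asks the runner to start a bot."""
    return urllib.request.Request(
        base_url,
        data=b"{}",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def parse_start_response(body: bytes) -> Tuple[str, str]:
    """Extract the room URL and user token from the runner's reply."""
    payload = json.loads(body)
    return payload["room_url"], payload["token"]


def start_session(base_url: str) -> Tuple[str, str]:
    """Send the request and return (room_url, token)."""
    with urllib.request.urlopen(build_start_request(base_url)) as response:
        return parse_start_response(response.read())
```

Calling `start_session("https://YOUR_FLY_APP_NAME/")` blocks until the runner replies, so a generous client timeout may be needed in practice.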

@@ -1,107 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import asyncio
import os
import sys
from dotenv import load_dotenv
from loguru import logger
from pipecat.audio.vad.silero import SileroVADAnalyzer
from pipecat.frames.frames import EndFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.elevenlabs.tts import ElevenLabsTTSService
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
load_dotenv(override=True)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")
daily_api_key = os.getenv("DAILY_API_KEY", "")
daily_api_url = os.getenv("DAILY_API_URL", "https://api.daily.co/v1")
async def main(room_url: str, token: str):
transport = DailyTransport(
room_url,
token,
"Chatbot",
DailyParams(
api_url=daily_api_url,
api_key=daily_api_key,
audio_in_enabled=True,
audio_out_enabled=True,
video_out_enabled=False,
vad_analyzer=SileroVADAnalyzer(),
transcription_enabled=True,
),
)
tts = ElevenLabsTTSService(
api_key=os.getenv("ELEVENLABS_API_KEY", ""),
voice_id=os.getenv("ELEVENLABS_VOICE_ID", ""),
)
llm = OpenAILLMService(api_key=os.getenv("OPENAI_API_KEY"))
messages = [
{
"role": "system",
"content": "You are Chatbot, a friendly, helpful robot. Your output will be converted to audio so don't include special characters other than '!' or '?' in your answers. Respond to what the user said in a creative and helpful way, but keep your responses brief. Start by saying hello.",
},
]
context = OpenAILLMContext(messages)
context_aggregator = llm.create_context_aggregator(context)
pipeline = Pipeline(
[
transport.input(),
context_aggregator.user(),
llm,
tts,
transport.output(),
context_aggregator.assistant(),
]
)
task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await transport.capture_participant_transcription(participant["id"])
await task.queue_frames([context_aggregator.user().get_context_frame()])
@transport.event_handler("on_participant_left")
async def on_participant_left(transport, participant, reason):
await task.cancel()
@transport.event_handler("on_call_state_updated")
async def on_call_state_updated(transport, state):
if state == "left":
# Here we don't want to cancel, we just want to finish sending
# whatever is queued, so we use an EndFrame().
await task.queue_frame(EndFrame())
runner = PipelineRunner()
await runner.run(task)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Pipecat Bot")
parser.add_argument("-u", type=str, help="Room URL")
parser.add_argument("-t", type=str, help="Token")
config = parser.parse_args()
asyncio.run(main(config.u, config.t))

@@ -1,209 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import argparse
import os
import subprocess
from contextlib import asynccontextmanager
import aiohttp
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse
from pipecat.transports.services.helpers.daily_rest import (
DailyRESTHelper,
DailyRoomObject,
DailyRoomParams,
DailyRoomProperties,
)
load_dotenv(override=True)
# ------------ Configuration ------------ #
MAX_SESSION_TIME = 5 * 60 # 5 minutes
REQUIRED_ENV_VARS = [
"DAILY_API_KEY",
"OPENAI_API_KEY",
"ELEVENLABS_API_KEY",
"ELEVENLABS_VOICE_ID",
"FLY_API_KEY",
"FLY_APP_NAME",
]
FLY_API_HOST = os.getenv("FLY_API_HOST", "https://api.machines.dev/v1")
FLY_APP_NAME = os.getenv("FLY_APP_NAME", "pipecat-fly-example")
FLY_API_KEY = os.getenv("FLY_API_KEY", "")
FLY_HEADERS = {"Authorization": f"Bearer {FLY_API_KEY}", "Content-Type": "application/json"}
daily_helpers = {}
# ----------------- API ----------------- #
@asynccontextmanager
async def lifespan(app: FastAPI):
aiohttp_session = aiohttp.ClientSession()
daily_helpers["rest"] = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=aiohttp_session,
)
yield
await aiohttp_session.close()
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# ----------------- Main ----------------- #
async def spawn_fly_machine(room_url: str, token: str):
async with aiohttp.ClientSession() as session:
# Use the same image as the bot runner
async with session.get(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Unable to get machine info from Fly: {text}")
data = await r.json()
image = data[0]["config"]["image"]
# Machine configuration
cmd = f"python3 bot.py -u {room_url} -t {token}"
cmd = cmd.split()
worker_props = {
"config": {
"image": image,
"auto_destroy": True,
"init": {"cmd": cmd},
"restart": {"policy": "no"},
"guest": {"cpu_kind": "shared", "cpus": 1, "memory_mb": 1024},
},
}
# Spawn a new machine instance
async with session.post(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines", headers=FLY_HEADERS, json=worker_props
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Problem starting a bot worker: {text}")
data = await r.json()
# Wait for the machine to enter the started state
vm_id = data["id"]
async with session.get(
f"{FLY_API_HOST}/apps/{FLY_APP_NAME}/machines/{vm_id}/wait?state=started",
headers=FLY_HEADERS,
) as r:
if r.status != 200:
text = await r.text()
raise Exception(f"Bot was unable to enter started state: {text}")
print(f"Machine joined room: {room_url}")
@app.post("/")
async def start_bot(request: Request) -> JSONResponse:
try:
data = await request.json()
# Is this a webhook creation request?
if "test" in data:
return JSONResponse({"test": True})
except Exception as e:
pass
# Use specified room URL, or create a new one if not specified
room_url = os.getenv("DAILY_SAMPLE_ROOM_URL", "")
if not room_url:
params = DailyRoomParams(properties=DailyRoomProperties())
try:
room: DailyRoomObject = await daily_helpers["rest"].create_room(params=params)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Unable to provision room {e}")
else:
# Check passed room URL exists, we should assume that it already has a sip set up
try:
room: DailyRoomObject = await daily_helpers["rest"].get_room_from_url(room_url)
except Exception:
raise HTTPException(status_code=500, detail=f"Room not found: {room_url}")
# Give the agent a token to join the session
token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)
if not room or not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room_url}")
# Launch a new fly.io machine, or run as a shell process (not recommended)
run_as_process = os.getenv("RUN_AS_PROCESS", False)
if run_as_process:
try:
subprocess.Popen(
[f"python3 -m bot -u {room.url} -t {token}"],
shell=True,
bufsize=1,
cwd=os.path.dirname(os.path.abspath(__file__)),
)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to start subprocess: {e}")
else:
try:
await spawn_fly_machine(room.url, token)
except Exception as e:
raise HTTPException(status_code=500, detail=f"Failed to spawn VM: {e}")
# Grab a token for the user to join with
user_token = await daily_helpers["rest"].get_token(room.url, MAX_SESSION_TIME)
return JSONResponse(
{
"room_url": room.url,
"token": user_token,
}
)
if __name__ == "__main__":
# Check environment variables
for env_var in REQUIRED_ENV_VARS:
if env_var not in os.environ:
raise Exception(f"Missing environment variable: {env_var}.")
parser = argparse.ArgumentParser(description="Pipecat Bot Runner")
parser.add_argument(
"--host", type=str, default=os.getenv("HOST", "0.0.0.0"), help="Host address"
)
parser.add_argument("--port", type=int, default=os.getenv("PORT", 7860), help="Port number")
parser.add_argument(
"--reload", action="store_true", default=False, help="Reload code on change"
)
config = parser.parse_args()
try:
import uvicorn
uvicorn.run("bot_runner:app", host=config.host, port=config.port, reload=config.reload)
except KeyboardInterrupt:
print("Pipecat runner shutting down...")
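
`RUN_AS_PROCESS` is read with `os.getenv` and used directly as a condition, so any non-empty string, including `false` or `0`, counts as enabled. A small sketch of a stricter parse, assuming a conventional set of truthy spellings; `env_flag` is a hypothetical helper and not part of Pipecat:

```python
import os
from typing import Mapping

# Assumed set of spellings that should enable a flag.
_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the variable holds an explicit truthy value."""
    return env.get(name, "").strip().lower() in _TRUE_VALUES
```

With this helper the check would read `if env_flag("RUN_AS_PROCESS"):`, and an unset or empty variable selects the fly.io machine path.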

@@ -1,8 +0,0 @@
DAILY_API_KEY=
DAILY_SAMPLE_ROOM_URL= # Enter a Daily room URL to use a set room URL each time (useful for local testing)
OPENAI_API_KEY=
ELEVENLABS_API_KEY=
ELEVENLABS_VOICE_ID=
FLY_API_KEY=
FLY_APP_NAME=
RUN_AS_PROCESS= # Spawn fly.io machine for each session or run as local process

@@ -1,25 +0,0 @@
# fly.toml app configuration file generated for pipecat-fly-example on 2024-07-01T15:04:53+01:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#
app = 'pipecat-fly-example'
primary_region = 'sjc'
[build]
[env]
FLY_APP_NAME = 'pipecat-fly-example'
[http_service]
internal_port = 7860
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 0
processes = ['app']
[[vm]]
memory = 512
cpu_kind = 'shared'
cpus = 1

@@ -1,5 +0,0 @@
pipecat-ai[daily,openai,silero,elevenlabs]
fastapi
uvicorn
python-dotenv
loguru

@@ -1,91 +0,0 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
dist/
*.egg-info/
*.egg
.installed.cfg
.eggs/
downloads/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
MANIFEST
# Virtual Environments
venv/
env/
.env
.venv/
ENV/
env.bak/
venv.bak/
# IDE
.idea/
.vscode/
.spyderproject
.spyproject
.ropeproject
# Testing and Coverage
.coverage
.coverage.*
htmlcov/
.pytest_cache/
.tox/
.nox/
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
cover/
# Logs and Databases
*.log
*.db
db.sqlite3
db.sqlite3-journal
pip-log.txt
# System Files
.DS_Store
Thumbs.db
desktop.ini
*.swp
*.swo
*.bak
*.tmp
*~
# Build and Documentation
docs/_build/
.pybuilder/
target/
instance/
.webassets-cache
.pdm.toml
.pdm-python
.pdm-build/
__pypackages__/
# Other
*.mo
*.pot
*.sage.py
.mypy_cache/
.dmypy.json
dmypy.json
.pyre/
.pytype/
cython_debug/
.ipynb_checkpoints

@@ -1,37 +0,0 @@
# Deploying Pipecat to Modal.com
Barebones deployment example for [modal.com](https://www.modal.com)
1. Install dependencies
```bash
python -m venv venv
source venv/bin/activate # or OS equivalent
pip install -r requirements.txt
```
2. Setup .env
```bash
cp env.example .env
```
Alternatively, you can configure your Modal app to use [secrets](https://modal.com/docs/guide/secrets)
3. Test the app locally
```bash
modal serve app.py
```
4. Deploy to production
```bash
modal deploy app.py
```
## Configuration options
This app sets some sensible defaults for reducing cold starts, such as `keep_warm=1`, which keeps at least one warm instance ready for your bot function.
It has been configured to only allow a concurrency of 1 (`max_inputs=1`) as each user will require their own running function.

@@ -1,80 +0,0 @@
#
# Copyright (c) 2024–2025, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import os
import aiohttp
import modal
from bot import _voice_bot_process
from fastapi import HTTPException
from fastapi.responses import JSONResponse
from loguru import logger
MAX_SESSION_TIME = 15 * 60 # 15 minutes
app = modal.App("pipecat-modal")
image = modal.Image.debian_slim(python_version="3.12").pip_install_from_requirements(
"requirements.txt"
)
@app.function(
image=image,
cpu=1.0,
secrets=[modal.Secret.from_dotenv()],
keep_warm=1,
enable_memory_snapshot=True,
max_inputs=1, # Do not reuse instances across requests
retries=0,
)
def launch_bot_process(room_url: str, token: str):
_voice_bot_process(room_url, token)
@app.function(
image=image,
secrets=[modal.Secret.from_dotenv()],
)
@modal.web_endpoint(method="POST")
async def start():
from pipecat.transports.services.helpers.daily_rest import (
DailyRESTHelper,
DailyRoomParams,
)
logger.info("Request received")
async with aiohttp.ClientSession() as session:
daily_rest_helper = DailyRESTHelper(
daily_api_key=os.getenv("DAILY_API_KEY", ""),
daily_api_url=os.getenv("DAILY_API_URL", "https://api.daily.co/v1"),
aiohttp_session=session,
)
# Create new Daily room
room = await daily_rest_helper.create_room(DailyRoomParams())
if not room.url:
raise HTTPException(
status_code=500,
detail="Unable to create room",
)
logger.info(f"Created room: {room.url}")
# Create bot token for room
token = await daily_rest_helper.get_token(room.url, MAX_SESSION_TIME)
if not token:
raise HTTPException(status_code=500, detail=f"Failed to get token for room: {room.url}")
logger.info(f"Bot token created: {token}")
# Spawn a new bot process
launch_bot_process.spawn(room_url=room.url, token=token)
# Return room URL to the user to join
# Note: in production, you would want to return a token to the user
return JSONResponse(content={"room_url": room.url, "token": token})

Some files were not shown because too many files have changed in this diff.