Compare commits

...

151 Commits

Author SHA1 Message Date
filipi87
16133a2323 Removing the custom prompt. 2026-04-01 16:05:09 -03:00
filipi87
9d815cb5d2 Merge branch 'filipi/async_tools' into filipi/async_tools_structured_data 2026-04-01 15:50:35 -03:00
filipi87
2d87edac18 Merge branch 'main' into filipi/async_tools 2026-04-01 15:49:43 -03:00
filipi87
bce07e0c76 Merge branch 'filipi/async_tools' into filipi/async_tools_structured_data 2026-04-01 15:48:22 -03:00
filipi87
59092fe4fe Renaming the examples to match main. 2026-04-01 15:42:50 -03:00
filipi87
d515a81073 Updating the Anthropic example to use async function calls. 2026-04-01 15:31:32 -03:00
kompfner
a3c7f6c2af Merge pull request #4215 from pipecat-ai/pk/remove-openaillmcontext
Remove deprecated `OpenAILLMContext` as well as everything (code path…
2026-04-01 14:03:35 -04:00
Paul Kompfner
df68665ec1 Add changelog entries for OpenAILLMContext removal 2026-04-01 14:03:08 -04:00
filipi87
e23cb46885 Trying to structure async tool responses and improve the LLM prompt to teach it how to handle them. 2026-04-01 14:48:09 -03:00
Harshita Jain
bd6cbd7fe7 feat: add Smallest AI STT service integration (#4162)
Add SmallestSTTService using the Pulse WebSocket API for real-time
transcription. Includes SmallestSTTSettings dataclass, 32-language
support with resolve_language fallback, VAD-driven finalize signal,
and SMALLEST_TTFS_P99 latency constant.   

Also adds X-Source and X-Pipecat-Version headers to Smallest STT
and TTS WebSocket connections.
2026-04-01 13:44:04 -04:00
filipi87
72bbad51b7 Added group_parallel_tools parameter to LLMService. 2026-04-01 13:51:30 -03:00
filipi87
c066a913fe Adding changelogs for all the fixes. 2026-04-01 12:20:58 -03:00
filipi87
63bbfc3b27 Creating the concept of a group_id for the function calls. 2026-04-01 12:05:09 -03:00
filipi87
2458b9d42b Delaying the response for the get_current_weather in the openai example. 2026-04-01 10:47:29 -03:00
filipi87
4543aef3d9 Only pushing a context frame when we receive the function call result if the user is not speaking. 2026-04-01 10:45:00 -03:00
filipi87
260368b6f4 Fixing an issue where the BotOutputTransport was discarding the UninterruptibleFrames. 2026-04-01 10:32:11 -03:00
filipi87
3ad2675b24 Creating UninterruptibleProcessQueue. 2026-04-01 10:28:52 -03:00
filipi87
970d713d7a Using a JSON to send the result. 2026-04-01 10:28:03 -03:00
Mark Backman
33ef6b3174 Merge pull request #4218 from pipecat-ai/mb/rename-all-examples
Rename all examples
2026-04-01 07:15:57 -04:00
Mark Backman
3ca656cae5 Update simli name to match others 2026-03-31 22:54:21 -04:00
Mark Backman
6a84d02156 Update evals
- Removed evals for removed services
- Added eval for function-calling-deepseek.py
2026-03-31 22:13:52 -04:00
Mark Backman
080da8b94c Update eval script paths to match renamed example files 2026-03-31 22:09:42 -04:00
Mark Backman
d3021b4590 Rename example files to prepend parent folder name, preventing package shadowing
Example files like openai.py shadow installed packages when Python adds the
script directory to sys.path. Prepend the parent folder name to each example
file (e.g. openai.py -> function-calling-openai.py). Also split
thinking-and-mcp/ into separate mcp/ and thinking/ directories.
2026-03-31 22:06:01 -04:00
Paul Kompfner
92e34ea6e8 Fix potential UnboundLocalError for system_message in tracing decorator
Restore the `system_message = None` initialization that was dropped
when collapsing the OpenAILLMContext branch.
2026-03-31 21:00:51 -04:00
Paul Kompfner
ebab75765d Fix stream cancellation tests to mock get_chat_completions
The tests were mocking the removed _stream_chat_completions_*_context
methods. Update them to mock get_chat_completions instead.
2026-03-31 18:54:23 -04:00
Paul Kompfner
110c88bf92 Remove stale re-export of deleted google.openai subpackage 2026-03-31 18:53:55 -04:00
Paul Kompfner
19e521b75a Simplify LLMContextFrame handling in process_frame methods
Now that LLMContextFrame is the only frame that provides a context,
remove the intermediate `context = None` / `if context:` pattern
and handle context processing directly in the isinstance branch.
2026-03-31 18:35:48 -04:00
Paul Kompfner
394599d031 Remove deprecated OpenAILLMContext as well as everything (code paths or whole types) dependent on it (all of which were also deprecated) 2026-03-31 18:15:25 -04:00
filipi87
f7012c570c Fixed an issue in the FrameProcessor where only the current frame was checked for being an UninterruptibleFrame, not other frames in the queue. 2026-03-31 18:38:11 -03:00
filipi87
4bfa084f77 Updating the openai example to be async. 2026-03-31 17:37:39 -03:00
filipi87
780d6c476d Merge branch 'main' into filipi/async_tools 2026-03-31 17:36:40 -03:00
filipi87
dfdb92958b Fix async tool handling for compatibility with all LLMs. 2026-03-31 17:26:06 -03:00
mattie ruth backman
0f47076703 More RTVI version parsing improvements 2026-03-31 16:05:53 -04:00
mattie ruth backman
3e255f3d21 improve version format check 2026-03-31 16:05:53 -04:00
mattie ruth backman
565b9b961d add tests for rtvi versioning 2026-03-31 16:05:53 -04:00
mattie ruth backman
692c3c74d1 We should now expect clients to be version 1.0.0 with valid versioning info 2026-03-31 16:05:53 -04:00
Mark Backman
7d309b3340 Merge pull request #4208 from pipecat-ai/mb/remove-deprecated-services
Remove deprecated service module shims
2026-03-31 15:37:12 -04:00
Mark Backman
04e8444096 Add changelog for #4208 2026-03-31 15:34:16 -04:00
Mark Backman
7501effad5 Remove deprecated service module shims and old implementations
Delete deprecated import shims that only re-export from new locations:
- services/ai_services.py
- services/gemini_multimodal_live/
- services/aws_nova_sonic/
- services/openai_realtime/
- services/deepgram/{stt,tts}_sagemaker.py
- services/google/{llm_openai,llm_vertex,google}.py
- services/google/gemini_live/llm_vertex.py
- services/riva/
- services/nim/

Remove deprecated implementations replaced by newer services:
- services/openai_realtime_beta/ (use openai.realtime)
- services/google/openai/ (use google.llm)

Also removes associated examples and tests for deleted services.
2026-03-31 15:34:14 -04:00
Mark Backman
0c8ff9c4c3 Merge pull request #4209 from pipecat-ai/mb/grok-3-default
Change GrokLLMService default model to grok-3
2026-03-31 15:29:34 -04:00
Mark Backman
53f6426b0b Merge pull request #4216 from pipecat-ai/mb/add-missing-google-vertex
Add missing google-vertex.py file
2026-03-31 15:29:04 -04:00
Mark Backman
9e32ade44b Merge pull request #4203 from pipecat-ai/mb/fix-json-decode-tool-calls
Handle incomplete function call arguments from interrupted LLM streams
2026-03-31 15:28:53 -04:00
Mark Backman
2574d24400 Merge pull request #4202 from pipecat-ai/mb/fix-inworld-tts-streaming-utf8
Fix UTF-8 decode error in Inworld TTS streaming response
2026-03-31 15:28:37 -04:00
Mark Backman
27cb078716 Add missing google-vertex.py file 2026-03-31 15:25:52 -04:00
Mark Backman
ca636813a8 Merge pull request #4206 from pipecat-ai/mb/flatten-examples-dir
Move foundational examples to examples/
2026-03-31 15:23:49 -04:00
Mark Backman
47b41a0ff7 Rename services/ to voice/ and function-calling/, flatten to top level
Replace the nested services/speech/ and services/function-calling/ with
top-level voice/ and function-calling/ directories. Update eval script
paths and README to match.
2026-03-31 15:20:03 -04:00
Mark Backman
f14638a1fd Revert "Flatten services/ nesting: promote speech and function-calling to top level"
This reverts commit e1939ecd44.
2026-03-31 14:59:23 -04:00
Mark Backman
e1939ecd44 Flatten services/ nesting: promote speech and function-calling to top level
Move services/speech/* directly into services/ and services/function-calling/*
into top-level function-calling/. Update eval script paths and README.
2026-03-31 14:55:22 -04:00
Mark Backman
dc5b94f9e0 Merge pull request #4213 from pipecat-ai/mb/google-imagen-4
Update default Google Imagen model to imagen-4.0
2026-03-31 13:20:20 -04:00
Mark Backman
1d85aedcae Split features/ into audio/, observability/, and rag/ subfolders
Extract focused example groups from the catch-all features/ folder:
- audio/: audio recording, background sound, sound effects
- observability/: observer, heartbeats, sentry metrics
- rag/: mem0, gemini-rag, gemini grounding metadata

Update README to document the new folders.
2026-03-31 13:15:06 -04:00
Mark Backman
e719cbbe6d Reorganize examples into topic-based subfolders
Move 304 examples from a flat numbered directory into 14 descriptive
subfolders: getting-started, services (speech + function-calling),
transcription, vision, realtime, persistent-context,
context-summarization, update-settings (stt/tts/llm), turn-management,
thinking-and-mcp, transports, video-avatar, video-processing, and
features.

Strip numbered prefixes from filenames (e.g. 07c-interruptible-deepgram.py
becomes services/speech/deepgram.py) since the folder context makes them
redundant. Keep numbered prefixes only in getting-started/ where ordering
matters.

Update eval script paths and README to match the new structure.
2026-03-31 13:12:24 -04:00
Mark Backman
f2ce7ececc Move foundational examples to examples/ 2026-03-31 13:12:24 -04:00
kompfner
bd7496fa27 Merge pull request #4211 from pipecat-ai/pk/openai-responses-websocket-service-refactor
Introduce WebsocketLLMService and refactor OpenAIResponsesLLMService …
2026-03-31 13:02:45 -04:00
Paul Kompfner
0a8bcf58c4 Register on_connection_error event handler in WebsocketLLMService 2026-03-31 10:52:33 -04:00
Paul Kompfner
0fb45c6114 Guard _drain_cancelled_response against None websocket 2026-03-31 10:32:47 -04:00
Paul Kompfner
657a5def57 Use consistent 'inference' terminology in error messages 2026-03-31 10:17:29 -04:00
Paul Kompfner
30903042e5 Work around OpenAI Python SDK temperature bug in example 2026-03-31 10:16:30 -04:00
Mark Backman
9936ec16cb Add changelog for #4213 2026-03-31 09:28:31 -04:00
Mark Backman
212aff15c9 Update default Google Imagen model to imagen-4.0-generate-001 2026-03-31 09:16:24 -04:00
Paul Kompfner
f2b3f87661 Clarify discrete vs continuous contrast in WebsocketLLMService docstring 2026-03-30 23:46:23 -04:00
Paul Kompfner
77cfb181f6 Clarify per-inference helper usage in WebsocketLLMService docstring 2026-03-30 23:25:56 -04:00
Paul Kompfner
0b256936c6 Add ConnectionClosed to _receive_response_events raises docstring 2026-03-30 23:14:45 -04:00
Paul Kompfner
3922963c7a Extract helpers in _process_context to reduce repeated code 2026-03-30 23:10:38 -04:00
Paul Kompfner
ab9f2a35b6 Clean up TTFB metrics and previous_response state on inference failure 2026-03-30 23:04:06 -04:00
Paul Kompfner
f19d1183d8 Clean up TTFB metrics and previous_response state on retry failure 2026-03-30 23:00:22 -04:00
Paul Kompfner
9ad4fe6344 Use concrete inference language instead of abstract transaction terminology 2026-03-30 22:42:40 -04:00
Paul Kompfner
04882f6f2a Simplify _connect_websocket guard and remove unused State import 2026-03-30 22:32:08 -04:00
Paul Kompfner
712e42533d Introduce WebsocketLLMService and refactor OpenAIResponsesLLMService to use it
Add WebsocketLLMService as a base class for WebSocket-based LLM services,
parallel to WebsocketTTSService/WebsocketSTTService but codifying a
transactional request-response model rather than a continuous background
receive loop.

WebsocketLLMService provides:
- Connection lifecycle (start/stop/cancel → connect/disconnect)
- _ws_send/_ws_recv with transparent ConnectionClosed handling
  (auto-reconnect via exponential backoff → WebsocketReconnectedError)
- _ensure_connected with retry via _try_reconnect

OpenAIResponsesLLMService now inherits from WebsocketLLMService, removing
duplicated connection management code (_connect, _disconnect, _reconnect,
_ensure_connected, _ws_send, start, stop, cancel) and simplifying
_process_context from a loop with attempt tracking to a flat try/except
with a single retry.
2026-03-30 22:26:31 -04:00
Mark Backman
7d8b436018 Add changelog for #4209 2026-03-30 21:40:17 -04:00
Mark Backman
bf1856f610 Change GrokLLMService default model from grok-3-beta to grok-3
The grok-3 model is now generally available, so update the default
from the beta variant.
2026-03-30 21:39:33 -04:00
Mark Backman
248e0a4c90 Merge pull request #4207 from pipecat-ai/mb/remove-krisp
Remove docs uses of krisp optional dependency
2026-03-30 19:54:14 -04:00
Mark Backman
89dcd57577 Remove docs uses of krisp optional dependency 2026-03-30 19:50:40 -04:00
Mark Backman
32022a952e Merge pull request #4205 from pipecat-ai/mb/remove-quickstart
Remove quickstart example from repo
2026-03-30 18:58:49 -04:00
Aleix Conchillo Flaqué
65d9fcc315 Merge pull request #4204 from pipecat-ai/aleix/remove-some-deprecations
Remove deprecated APIs and modules
2026-03-30 15:32:53 -07:00
Mark Backman
b78ae40d3c Remove quickstart example from repo 2026-03-30 18:20:41 -04:00
Aleix Conchillo Flaqué
ece4d0661e update uv.lock 2026-03-30 15:06:05 -07:00
Aleix Conchillo Flaqué
82a852c1ff Add changelog for #4204 2026-03-30 15:06:05 -07:00
Aleix Conchillo Flaqué
5be1b9c8cb LLMService: remove deprecated request_image_frame() 2026-03-30 15:06:05 -07:00
Aleix Conchillo Flaqué
7913d4e188 FrameProcessor: remove deprecated wait_for_task() 2026-03-30 14:45:42 -07:00
Aleix Conchillo Flaqué
c8dd7c2b57 rtvi: remove old deprecations 2026-03-30 14:44:32 -07:00
Aleix Conchillo Flaqué
77e5f4acc1 runner(daily): remove deprecated configure_with_args() 2026-03-30 14:31:39 -07:00
Aleix Conchillo Flaqué
be8d4dfd87 TTSService: remove deprecated say() function 2026-03-30 14:29:30 -07:00
Aleix Conchillo Flaqué
bb2c60a998 transports: remove deprecated vad_enabled and vad_audio_passthrough 2026-03-30 14:28:34 -07:00
Aleix Conchillo Flaqué
7c644ed810 RTVIObserver: remove deprecated errors_enabled 2026-03-30 14:26:53 -07:00
Aleix Conchillo Flaqué
96ceec2a43 transports: remove deprecated camera_in_* and camera_out_* params 2026-03-30 14:24:40 -07:00
Aleix Conchillo Flaqué
d249473f0b AudioBufferProcessor: remove deprecated user_continuous_stream 2026-03-30 14:22:21 -07:00
Aleix Conchillo Flaqué
1da2018c85 PipelineTask: remove deprecated on_pipeline_ended/cancelled/stopped 2026-03-30 14:20:45 -07:00
Aleix Conchillo Flaqué
af126ec7cf PipelineParams: remove deprecated observers field 2026-03-30 14:18:07 -07:00
Aleix Conchillo Flaqué
340e58bf5c LLMService: remove old function call single argument 2026-03-30 14:16:18 -07:00
Aleix Conchillo Flaqué
7873159d0f LLMService: remove start_callback 2026-03-30 14:13:23 -07:00
Aleix Conchillo Flaqué
c783101741 frames: remove deprecated interruption frames 2026-03-30 14:08:42 -07:00
Aleix Conchillo Flaqué
73b8bbf963 frames: remove deprecated transport frames 2026-03-30 14:08:24 -07:00
Aleix Conchillo Flaqué
ebbe5acc8f frames: remove deprecated KeypadEntryFrame 2026-03-30 14:07:54 -07:00
Aleix Conchillo Flaqué
dd1bea2a5f audio(turn): remove FalSmartTurnAnalyzer and LocalSmartTurnAnalyzer 2026-03-30 14:04:29 -07:00
Aleix Conchillo Flaqué
136e6a58be audio(utils): remove create_default_resampler 2026-03-30 14:02:13 -07:00
Aleix Conchillo Flaqué
f0d04dde1c audio(filters): remove KrispFilter 2026-03-30 14:01:06 -07:00
Aleix Conchillo Flaqué
742a278c05 audio(filters): remove NoisereduceFilter 2026-03-30 13:58:35 -07:00
Aleix Conchillo Flaqué
b16befc9e9 transports(daily): remove deprecated frames 2026-03-30 13:56:25 -07:00
kompfner
0c11eb6fd0 Merge pull request #4141 from pipecat-ai/pk/openai-responses-websocket-service
feat: add WebSocket-based OpenAI Responses LLM service
2026-03-30 15:25:32 -04:00
Mark Backman
ea39389e03 Add changelog for #4203 2026-03-30 14:24:49 -04:00
Mark Backman
4adf0fd585 Handle incomplete function call arguments from interrupted LLM streams
When a user interruption causes the LLM chunk stream to exit early,
function call arguments may be incomplete JSON. Wrap json.loads() in
try/except JSONDecodeError to skip malformed function calls with a
warning instead of crashing. Fixes #2461.
2026-03-30 14:24:04 -04:00
Mark Backman
465b9bcbc6 Add changelog for #4202 2026-03-30 14:16:21 -04:00
Mark Backman
3f4814cf84 Fix UTF-8 decode error in Inworld TTS streaming response
Buffer raw bytes and only decode after splitting on newline boundaries,
preventing multi-byte UTF-8 characters from being split at chunk edges.

Fixes #3538
2026-03-30 14:15:06 -04:00
Paul Kompfner
0efef19d60 Fix code review issues in WebSocket Responses service
- Use finally block in _disconnect to ensure state is always cleaned
  up, even if websocket.close() throws — prevents stale cancellation
  state (e.g. _cancel_pending_response) from polluting a new connection
- Catch ConnectionClosed in _drain_cancelled_response alongside
  TimeoutError — prevents _needs_drain from staying True and bricking
  the service on every subsequent inference attempt
- Fall back to OPENAI_API_KEY env var when api_key is not passed,
  since the WebSocket connection uses raw websockets (not the
  AsyncOpenAI client which handles this automatically)
- Use _clear_cancellation_state() instead of piecemeal resets where
  appropriate
2026-03-30 10:54:47 -04:00
Mark Backman
87b8f38a48 Merge pull request #4198 from pipecat-ai/mb/readme-update-2026-03-30
Add missing services to README available services table
2026-03-30 10:46:52 -04:00
Mark Backman
e1a3ddbb57 Add missing services to README available services table
Adds Kokoro (TTS), LiveKit and WhatsApp (Transport), Genesys
(Serializers), and Krisp Viva and RNNoise (Audio Processing).
2026-03-30 10:06:14 -04:00
Paul Kompfner
b5683556d4 Remove duplicate entries in run-release-evals.py, which appeared after a rebase 2026-03-30 10:03:43 -04:00
Paul Kompfner
26f85687d6 Handle response cancellation by draining before next inference
Instead of trying to filter stale events inline (unreliable — the API
doesn't provide a way to correlate events to a specific response),
drain remaining events from a cancelled response before starting the
next one. On cancellation, send response.cancel and set a drain flag.
At the start of the next _process_context, read and discard events
until a terminal event arrives, ensuring a clean connection. Falls
back to reconnecting if draining times out.
2026-03-30 09:59:03 -04:00
Paul Kompfner
670ce30a1c Document why HTTP variant doesn't use previous_response_id
Over HTTP, previous_response_id requires store=True (30-day OpenAI-side
conversation storage). The WebSocket variant avoids this via a
connection-local in-memory cache that works with store=False. Add
comments explaining this in both class docstrings, at the store=False
parameter, and in the adapter's previous_response_id note.
2026-03-30 09:59:03 -04:00
Paul Kompfner
1c8d31de70 Add trace logging for previous_response_id decisions and fix example
Add detailed trace-level logging to _apply_previous_response_optimization
showing why the optimization was applied or fell back to full context,
including the relevant data for debugging.

Use append_to_context=False for the filler TTSSpeakFrame in the
function-calling example to avoid altering the conversation history
and breaking the previous_response_id prefix match.
2026-03-30 09:59:03 -04:00
Paul Kompfner
9defff2a34 Skip server-known output items in previous_response_id optimization
When using previous_response_id, the server already knows its own
output from the previous response. Store the raw response output and,
on the next call, compare it against the items following the matched
input prefix — checking role and text content for messages, and call_id
for function calls. If the items match, skip them and send only truly
new input (user messages, tool results). Falls back to full context if
either the prefix or the output comparison fails.
2026-03-30 09:59:03 -04:00
Paul Kompfner
59d28f9fd2 Add changelog for WebSocket OpenAI Responses service 2026-03-30 09:59:03 -04:00
Paul Kompfner
f2a8a9e753 Add WebSocket-based OpenAI Responses LLM service with previous_response_id optimization
Introduce a WebSocket variant of the OpenAI Responses API service that
maintains a persistent connection to wss://api.openai.com/v1/responses
for lower-latency inference. The WebSocket variant automatically uses
previous_response_id to send only incremental context when possible,
falling back to full context on reconnection or cache miss.

The WebSocket variant becomes the new default OpenAIResponsesLLMService,
and the HTTP variant is renamed to OpenAIResponsesHttpLLMService. Both
share a private base class with common settings, parameter building,
and run_inference (always HTTP) logic.
2026-03-30 09:58:56 -04:00
Mark Backman
d1eb2699f3 Merge pull request #4192 from pipecat-ai/mb/update-langchain
Update langchain dependencies to latest major versions
2026-03-30 08:54:41 -04:00
Mark Backman
2e0f5fc6e9 Merge pull request #4194 from pipecat-ai/mb/update-community-integrations-package-convention
Add pipecat-{vendor} package naming convention to community guide
2026-03-30 08:52:28 -04:00
Mark Backman
dd3ca6fbba Merge pull request #4191 from pipecat-ai/mb/remove-openpipe
Remove OpenPipe integration
2026-03-30 08:52:14 -04:00
Mark Backman
171692aa30 Add pipecat-{vendor} package naming convention to community guide
Formalizes the package naming pattern that most community contributors
already follow organically, improving discoverability on PyPI.
2026-03-29 12:39:20 -04:00
Mark Backman
81ddd103f9 Fix KeyError on context messages without role in RTVI observer
Use dict.get() instead of direct key access to handle context messages
that don't have a 'role' key, such as tool results.
2026-03-29 10:28:00 -04:00
Mark Backman
8c9e189394 Fix langchain imports for langchain 1.x compatibility
ChatPromptTemplate moved from langchain.prompts to langchain_core.prompts
in langchain 1.x.
2026-03-29 10:27:48 -04:00
Mark Backman
b6579dc763 Update uv lock with latest versions of Pygments and cryptography 2026-03-29 10:20:45 -04:00
Mark Backman
abd63336e4 Add changelog for #4192 2026-03-29 10:18:52 -04:00
Mark Backman
ccb9dc20f8 Update langchain dependencies to latest major versions
Update langchain 0.3→1.2, langchain-community 0.3→0.4, and
langchain-openai 0.3→1.1. This also unblocks openai>=2.26 which
was previously constrained by the now-removed openpipe package.
2026-03-29 10:17:28 -04:00
Mark Backman
2177e28ee1 Remove OpenPipe integration
OpenPipe was acquired by CoreWeave in September 2025. The Python package
hasn't been updated since June 2025 and the repo since 2024. The openpipe
package caps openai<=1.97.1, creating dependency conflicts with other
extras. Remove the dead integration to clean up the codebase.
2026-03-29 10:12:35 -04:00
Mark Backman
3eb7c2bcd9 Merge pull request #4187 from OmerCohenAviv/fix/heartbeat-monitor-configurable
Fix heartbeat monitor timeout not respecting custom heartbeat interval
2026-03-29 09:31:12 -04:00
Mark Backman
878940f94e Merge pull request #4189 from Arindam200/main
Add NebiusLLMService for Nebius Token Factory
2026-03-29 09:03:06 -04:00
Mark Backman
a3aeafcb2d Alphabetize nebius entry in pyproject.toml extras 2026-03-29 08:58:01 -04:00
Mark Backman
63254fe337 Add NebiusLLMService with developer role and tool support fixes
- Add Nebius LLM service wrapping OpenAI-compatible Token Factory API
- Set supports_developer_role = False (Nebius rejects developer role)
- Default to openai/gpt-oss-120b model (supports function calling)
- Add Nebius function-calling example and env.example entry
- Fix Sarvam developer role support
- Update examples to use developer role for intro messages
2026-03-29 08:50:11 -04:00
Arindam200
39919f7889 Add NebiusLLMService for Nebius Token Factory
Adds an OpenAI-compatible LLM service for Nebius Token Factory, supporting
open-source models (Meta Llama, Qwen, DeepSeek) via their OpenAI-compatible
REST API at https://api.tokenfactory.nebius.com/v1/.
2026-03-29 14:35:46 +05:30
OmercohenAviv
f2e0f5d20c move wait_time out of loop 2026-03-29 00:05:21 +03:00
OmercohenAviv
2724ef6d6f non optional 2026-03-28 12:12:02 +03:00
OmercohenAviv
33fb8852e6 ruff 2026-03-28 12:05:30 +03:00
OmercohenAviv
5fe48da2fb Merge branch 'main' into fix/heartbeat-monitor-configurable 2026-03-28 11:57:23 +03:00
OmercohenAviv
dccd98ec8a test 2026-03-28 11:53:51 +03:00
Aleix Conchillo Flaqué
a84c69858e Merge pull request #4185 from pipecat-ai/changelog-0.0.108
Release 0.0.108 - Changelog Update
2026-03-27 21:47:53 -07:00
aconchillo
ca224219dc Update changelog for version 0.0.108 2026-03-27 21:43:37 -07:00
Aleix Conchillo Flaqué
83dc979d19 Merge pull request #4186 from pipecat-ai/mb/fix-websocket-disconnect-race-condition
Fix FastAPI WebSocket disconnect race condition
2026-03-27 21:40:21 -07:00
Aleix Conchillo Flaqué
fc76b3f2fb update pyproject.toml and uv.lock 2026-03-27 21:36:03 -07:00
Mark Backman
4670370dbb Add changelog for #4186 2026-03-28 00:02:44 -04:00
Mark Backman
47e53890e3 Fix FastAPI WebSocket disconnect race condition causing pipeline hang
When the remote side disconnects while send() is in flight, send() was
setting _closing=True. This prevented the receive loop from firing
on_client_disconnected, causing the pipeline to hang waiting for a
disconnect signal that never came.

The fix removes _closing from send() (that flag means we initiated the
close) and instead checks Starlette application_state in _can_send()
to suppress subsequent sends after a failure.

Fixes #3912
2026-03-28 00:01:25 -04:00
Aleix Conchillo Flaqué
195180b6f4 Merge pull request #4184 from pipecat-ai/aleix/fix-sarvam-examples-role
Fix Sarvam examples to use 'user' role instead of 'developer'
2026-03-27 20:34:59 -07:00
Aleix Conchillo Flaqué
8b64166bb7 Fix Sarvam examples to use 'user' role instead of 'developer'
Sarvam uses the OpenAI-compatible API but does not support the
'developer' role, causing errors. Use 'user' role instead.
2026-03-27 20:33:25 -07:00
Aleix Conchillo Flaqué
1d18995435 Merge pull request #4183 from pipecat-ai/aleix/fix-task-scheduling
Yield after create_task to ensure timer tasks are scheduled
2026-03-27 20:32:32 -07:00
Aleix Conchillo Flaqué
ea7324b2ba Add changelog for #4183 2026-03-27 19:03:55 -07:00
Aleix Conchillo Flaqué
52ed7137af Yield after create_task to ensure timer tasks are scheduled
Add `await asyncio.sleep(0)` after `create_task()` calls in
UserIdleController, SpeechTimeoutUserTurnStopStrategy,
TurnAnalyzerUserTurnStopStrategy, and UserTurnCompletionLLMServiceMixin
so the event loop schedules the newly created timer tasks before the
caller continues.
2026-03-27 19:03:23 -07:00
kompfner
b33df03724 Merge pull request #4179 from pipecat-ai/pk/fix-gemini-live-vertex
Don't send history_config for Gemini Live Vertex (unsupported)
2026-03-27 17:34:29 -04:00
Paul Kompfner
28fbe1db08 Don't send history_config for Gemini Live Vertex (unsupported) 2026-03-27 17:30:47 -04:00
kompfner
9240e92d9f Merge pull request #4177 from pipecat-ai/pk/tweak-26i-for-gemini-3.1-flash-live-support
Tweak 26i example system instruction for Gemini 3.1 Flash Live compat…
2026-03-27 17:20:06 -04:00
Paul Kompfner
5caf53f086 Tweak 26i example system instruction for Gemini 3.1 Flash Live compatibility
Gemini 3.1 Flash Live won't reliably report ending its turn until
after it says something following a tool call. Restructure the system
instruction so the model says goodbye *after* calling
end_conversation, and add a comment explaining the deferred EndFrame
behavior that makes this work.
2026-03-27 17:13:17 -04:00
Mark Backman
ac2716811c Merge pull request #4176 from pipecat-ai/mb/fix-websocket-rtvi-messages
Fix RTVI events not delivered over WebSocket transports
2026-03-27 16:50:37 -04:00
Mark Backman
d313d56776 Fix RTVI events not delivered over WebSocket transports
The base serializer filters out RTVI protocol messages by default
(ignore_rtvi_messages=True) to prevent them from being sent over
telephony media streams. ProtobufFrameSerializer is used by WebSocket
transports, which are the delivery channel for these messages, so
disable the filter there.
2026-03-27 16:47:11 -04:00
OmercohenAviv
de8ba68589 Fix heartbeat monitor timeout not respecting custom heartbeat interval
The heartbeat monitor timeout (`HEARTBEAT_MONITOR_SECS`) was a static
module-level constant that never derived from the user-configurable
`heartbeats_period_secs`. This meant overriding the heartbeat interval
had no effect on the monitor window, causing spurious warnings or
delayed detection depending on the configured interval.

Add a new `heartbeats_monitor_secs` parameter to `PipelineParams` so
the monitor timeout is independently configurable (defaults to 10s).
The monitor handler now reads from the instance param instead of the
hard-coded constant.

Made-with: Cursor
2026-03-27 19:41:06 +03:00
576 changed files with 4879 additions and 15168 deletions

View File

@@ -144,7 +144,7 @@ class InputParams(BaseModel):
#### Examples
Validated against `examples/foundational/07-interruptible.py`:
Validated against `examples/07-interruptible.py`:
- Proper `create_transport()` usage
- Correct pipeline structure

View File

@@ -42,7 +42,7 @@ jobs:
- name: Test uv sync with all extras
run: |
uv sync --group dev --all-extras --no-extra krisp
uv sync --group dev --all-extras
- name: Verify installation
run: |

View File

@@ -1,51 +0,0 @@
name: Sync Quickstart to pipecat-quickstart repo
on:
push:
branches: [main]
paths:
- 'examples/quickstart/**'
workflow_dispatch: # Manual trigger
jobs:
sync-quickstart:
runs-on: ubuntu-latest
steps:
- name: Checkout main repo
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout quickstart repo
uses: actions/checkout@v4
with:
repository: pipecat-ai/pipecat-quickstart
token: ${{ secrets.QUICKSTART_SYNC_TOKEN }}
path: quickstart-repo
- name: Sync files (excluding uv.lock and README.md)
run: |
# Copy all files except uv.lock and README.md
find examples/quickstart -type f \
-not -name "README.md" \
-not -name "uv.lock" \
-exec cp {} quickstart-repo/ \;
- name: Commit and push changes
run: |
cd quickstart-repo
git config user.name "GitHub Action"
git config user.email "action@github.com"
git add .
# Only commit if there are changes
if ! git diff --staged --quiet; then
git commit -m "Sync from pipecat main repo
Updated files from examples/quickstart/
Commit: ${{ github.sha }}
"
git push
else
echo "No changes to sync"
fi

View File

@@ -1,8 +1,13 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.12.1
- repo: local
hooks:
- id: ruff
language_version: python3
args: [--fix]
name: ruff
entry: uv run ruff check --fix
language: system
types: [python]
- id: ruff-format
name: ruff-format
entry: uv run ruff format
language: system
types: [python]

View File

@@ -11,7 +11,7 @@ build:
jobs:
post_install:
- pip install uv
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra krisp --no-extra gstreamer --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
- UV_PROJECT_ENVIRONMENT=$READTHEDOCS_VIRTUALENV_PATH uv sync --group docs --all-extras --no-extra gstreamer --no-extra local_smart_turn --no-extra moondream --no-extra riva --no-extra mlx-whisper
sphinx:
configuration: docs/api/conf.py

View File

@@ -7,6 +7,308 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<!-- towncrier release notes start -->
## [0.0.108] - 2026-03-27
### Added
- Added `SarvamLLMService` with support for `sarvam-30b`, `sarvam-30b-16k`,
`sarvam-105b` and `sarvam-105b-32k`.
(PR [#3978](https://github.com/pipecat-ai/pipecat/pull/3978))
- Added `on_turn_context_created(context_id)` hook to `TTSService`. Override
this to perform provider-specific setup (e.g. eagerly opening a server-side
context) before text starts flowing. Called each time a new turn context ID
is created.
(PR [#4013](https://github.com/pipecat-ai/pipecat/pull/4013))
- Added `XAIHttpTTSService` for text-to-speech using xAI's HTTP TTS API.
(PR [#4031](https://github.com/pipecat-ai/pipecat/pull/4031))
- Added support for "developer" role messages in conversation context across
all LLM adapters. For non-OpenAI services (Anthropic, Google, AWS Bedrock),
"developer" messages are converted to "user" messages (use
`system_instruction` to set the system instruction). For OpenAI services,
"developer" messages pass through in conversation history. For the Responses
API, they are kept as "developer" role (matching the existing "system" →
"developer" conversion).
(PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
- Added `SmallestTTSService`, a WebSocket-based TTS service integration with
Smallest AI's Waves API. Supports the Lightning v2 and v3.1 models with
configurable voice, language, speed, consistency, similarity, and enhancement
settings.
(PR [#4092](https://github.com/pipecat-ai/pipecat/pull/4092))
- Added warnings in turn stop strategies when `VADParams.stop_secs` differs
from the recommended default (0.2s) or when `stop_secs >= STT p99 latency`,
which collapses the STT wait timeout to 0s and may cause delayed turn
detection. The warnings guide developers to re-run the
[stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with their VAD
settings.
(PR [#4115](https://github.com/pipecat-ai/pipecat/pull/4115))
- Added `domain` parameter to `AssemblyAISTTSettings` for specialized
recognition modes such as Medical Mode (`domain="medical-v1"`).
(PR [#4117](https://github.com/pipecat-ai/pipecat/pull/4117))
- Added `NovitaLLMService` for using Novita AI's LLM models via their
OpenAI-compatible API.
(PR [#4119](https://github.com/pipecat-ai/pipecat/pull/4119))
- Added `cleanup()` method to `VADAnalyzer` and `VADController` so VAD analyzer
resources are properly released when no longer needed. Custom `VADAnalyzer`
subclasses can override `cleanup()` to free any held resources.
(PR [#4120](https://github.com/pipecat-ai/pipecat/pull/4120))
- Added `on_end_of_turn` event handler to `AssemblyAISTTService`. This fires
after the final transcript is pushed, providing a reliable hook for
end-of-turn logic that doesn't race with `TranscriptionFrame`. Works in both
Pipecat and AssemblyAI turn detection modes.
(PR [#4128](https://github.com/pipecat-ai/pipecat/pull/4128))
- Added `DeepgramFluxSageMakerSTTService` for running Deepgram Flux
speech-to-text on AWS SageMaker endpoints. Use with
`ExternalUserTurnStrategies` to take advantage of Flux's turn detection.
(PR [#4143](https://github.com/pipecat-ai/pipecat/pull/4143))
- Added `Mem0MemoryService.get_memories()` convenience method for retrieving
all stored memories outside the pipeline (e.g. to build a personalized
greeting at connection time). This avoids the need to manually handle client
type branching, filter construction, and async wrapping.
(PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
### Changed
- Added context prewarming path for `InworldTTSService` to improve first audio
latency.
(PR [#4013](https://github.com/pipecat-ai/pipecat/pull/4013))
- Added `KrispVivaVadAnalyzer` for Voice Activity Detection using the Krisp
VIVA SDK (requires `krisp_audio`).
(PR [#4022](https://github.com/pipecat-ai/pipecat/pull/4022))
- Modified `InworldTTSService` to close context at end of turn instead of
relying on idle timeout.
(PR [#4028](https://github.com/pipecat-ai/pipecat/pull/4028))
- Added Gemini 3 support to the Gemini Live service.
(PR [#4078](https://github.com/pipecat-ai/pipecat/pull/4078))
- `TTSService`: the default `stop_frame_timeout_s` (idle time before an
automatic `TTSStoppedFrame` is pushed when `push_stop_frames=True`) has
changed from `2.0` to `3.0` seconds.
(PR [#4084](https://github.com/pipecat-ai/pipecat/pull/4084))
- ⚠️ `GeminiLLMAdapter` now only treats `messages[0]` as the initial system
message, matching all other adapters. Previously it searched for the first
"system" message anywhere in the conversation history. A "system" message
appearing later in the list will now be converted to "user" instead of being
extracted as the system instruction.
(PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
- Fixed `InworldTtsService` to fallback to full text when TTS timestamps are
not received.
(PR [#4113](https://github.com/pipecat-ai/pipecat/pull/4113))
- ⚠️ Realtime services (Gemini Live, OpenAI Realtime, Grok Realtime, Nova
Sonic) now prefer `system_instruction` from service settings over an initial
system message in the LLM context, matching the behavior of non-realtime
services. Previously, context-provided system instructions took precedence. A
warning is now logged when both are set.
(PR [#4130](https://github.com/pipecat-ai/pipecat/pull/4130))
- Bumped `nvidia-riva-client` minimum version to `>=2.25.1`.
(PR [#4136](https://github.com/pipecat-ai/pipecat/pull/4136))
- Upgraded `protobuf` from 5.x to 6.x (`>=6.31.1,<7`).
(PR [#4136](https://github.com/pipecat-ai/pipecat/pull/4136))
- Unrecognized language strings (e.g. Deepgram's `"multi"`) no longer produce a
warning at startup. The log message has been downgraded to debug level since
these are valid service-specific values that are passed through correctly.
(PR [#4137](https://github.com/pipecat-ai/pipecat/pull/4137))
- `GrokLLMService` and `GrokRealtimeLLMService` now live in the
`pipecat.services.xai` module alongside `XAIHttpTTSService`, since all three
use the same xAI API. Update imports from `pipecat.services.grok.*` to
`pipecat.services.xai.*` (e.g. `from pipecat.services.xai.llm import
GrokLLMService`).
(PR [#4142](https://github.com/pipecat-ai/pipecat/pull/4142))
- ⚠️ Bumped `mem0ai` dependency from `~=0.1.94` to `>=1.0.8,<2`. Users of the
`mem0` extra will need to update their mem0ai package.
(PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
### Deprecated
- `pipecat.services.grok.llm`, `pipecat.services.grok.realtime.llm`, and
`pipecat.services.grok.realtime.events` are deprecated. The old import paths
still work but emit a `DeprecationWarning`; use `pipecat.services.xai.llm`,
`pipecat.services.xai.realtime.llm`, and
`pipecat.services.xai.realtime.events` instead.
(PR [#4142](https://github.com/pipecat-ai/pipecat/pull/4142))
### Removed
- ⚠️ `TTSService.add_word_timestamps()` no longer supports the `"Reset"` and
`"TTSStoppedFrame"` sentinel strings. If you have a custom TTS service that
called `await self.add_word_timestamps([("Reset", 0)])` or `await
self.add_word_timestamps([("TTSStoppedFrame", 0), ("Reset", 0)], ctx_id)`,
replace them with `await self.append_to_audio_context(ctx_id,
TTSStoppedFrame(context_id=ctx_id))` and let `_handle_audio_context` manage
the word-timestamp reset automatically.
(PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
- Removed `SambaNovaSTTService`. SambaNova no longer offers speech-to-text
audio models. Use another STT provider instead.
(PR [#4154](https://github.com/pipecat-ai/pipecat/pull/4154))
### Fixed
- Fixed Gemini Live (`GoogleGeminiLiveLLMService`) not honoring
`settings.system_instruction`. The system instruction was being read from a
deprecated constructor parameter instead of the settings object, causing it
to be silently ignored.
(PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
- Fixed `AWSBedrockLLMAdapter` sending an empty message list to the API when
the only message in context was a system message. The lone system message is
now converted to "user" role instead of being extracted, matching the
existing Anthropic adapter behavior.
(PR [#4089](https://github.com/pipecat-ai/pipecat/pull/4089))
- Fixed Gemini Live pipeline hanging indefinitely when an `EndFrame` was
deferred while waiting for the bot to finish responding and `turn_complete`
never arrived. As a possible root-cause fix, `turn_complete` messages are now
handled even if they lack `usage_metadata`. As a fallback, the deferred
`EndFrame` now has a 30-second safety timeout.
(PR [#4125](https://github.com/pipecat-ai/pipecat/pull/4125))
- Fixed ElevenLabs WebSocket disconnections (1008 "Maximum simultaneous
contexts exceeded") caused by rapid user interruptions. When interruptions
arrived before any TTS text was generated, phantom contexts were created on
the ElevenLabs server that were never closed, eventually exceeding the
5-context limit.
(PR [#4126](https://github.com/pipecat-ai/pipecat/pull/4126))
- Fixed the final sentence being dropped from the conversation context when
using RTVI text input with non-word-timestamp TTS services. The
`LLMFullResponseEndFrame` was racing ahead of the last `TTSTextFrame`,
causing the `LLMAssistantAggregator` to finalize the context before the final
sentence arrived.
(PR [#4127](https://github.com/pipecat-ai/pipecat/pull/4127))
- Fixed audio crackling and popping in recordings when both user and bot are
speaking. `AudioBufferProcessor` no longer injects silence into a track's
buffer while that track is actively producing audio, preventing mid-utterance
interruptions in the recorded output.
(PR [#4135](https://github.com/pipecat-ai/pipecat/pull/4135))
- Fixed websocket TTS word timestamps so interrupted contexts cannot leak stale
words or backward PTS values into later turns.
(PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
- Fixed a race condition in `InterruptibleTTSService` where, if `run_tts` had
been invoked but `BotStartedSpeakingFrame` had not yet been received, a user
interruption could allow stale audio to leak through.
(PR [#4145](https://github.com/pipecat-ai/pipecat/pull/4145))
- Fixed Gemini Live local VAD mode (`GeminiVADParams(disabled=True)` with
external VAD) not working. The bot now correctly detects user speech and
signals turn boundaries to the Gemini API.
(PR [#4146](https://github.com/pipecat-ai/pipecat/pull/4146))
- Fixed Gemini Live message handling to process all `server_content` fields
independently. Gemini 3.x can bundle multiple fields (e.g. `model_turn` and
`output_transcription`) on the same message, but the previous `elif` chain
only processed the first match, silently dropping the rest.
(PR [#4147](https://github.com/pipecat-ai/pipecat/pull/4147))
- Fixed `ServiceSwitcher` with `ServiceSwitcherStrategyFailover` incorrectly
triggering failover when `ErrorFrame`s from other pipeline stages (e.g. TTS)
propagated upstream through the switcher. Previously, any non-fatal error
passing through would be misattributed to the active service and trigger an
unwanted service switch. Now only errors originating from the switcher's own
managed services trigger failover.
(PR [#4149](https://github.com/pipecat-ai/pipecat/pull/4149))
- Fixed `LiveKitOutputTransport` not clearing the `rtc.AudioSource` internal
buffer on interruption, causing the bot to continue speaking for several
seconds after being interrupted.
(PR [#4151](https://github.com/pipecat-ai/pipecat/pull/4151))
- Fixed a crash in OpenAI LLM processing when the provider returns
`chunk.choices[0].delta.audio = None`, which caused `'NoneType' object has no
attribute 'get'` errors during audio transcript handling.
(PR [#4152](https://github.com/pipecat-ai/pipecat/pull/4152))
- Fixed error floods in `DeepgramSTTService` when the WebSocket connection
drops. With Deepgram SDK 6.x, `send_media()` raises exceptions on a dead
connection instead of silently failing, causing every queued audio frame to
log an error. Now `send_media()` failures are caught gracefully — a single
warning is logged and audio frames are skipped until the existing
reconnection logic restores the connection.
(PR [#4153](https://github.com/pipecat-ai/pipecat/pull/4153))
- `Mem0MemoryService` no longer blocks the event loop during memory storage and
retrieval. All Mem0 API calls now run in a background thread, and message
storage is fire-and-forget so it doesn't delay downstream processing.
(PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
- Fixed `Mem0MemoryService` failing to store messages when the context
contained system or developer role messages. The Mem0 API only accepts user
and assistant roles, so other roles are now filtered out before storing.
(PR [#4156](https://github.com/pipecat-ai/pipecat/pull/4156))
- Added missing `on_dtmf_event` callback to `LemonSliceTransportClient.setup()`
`DailyCallbacks` construction, fixing a `ValidationError` at pipeline setup
time.
(PR [#4161](https://github.com/pipecat-ai/pipecat/pull/4161))
- Fixed an issue in `InworldTTSService` where, in cases of fast interruption,
we would continue receiving audio from the previous context.
(PR [#4167](https://github.com/pipecat-ai/pipecat/pull/4167))
- Fixed a word timestamp interleaving issue in `InworldTTSService` when
processing multiple sentences.
(PR [#4167](https://github.com/pipecat-ai/pipecat/pull/4167))
- Fixed duplicate `TTSStoppedFrame` being pushed in TTS services using
`push_stop_frames=True`. When the stop-frame timeout fired, a second
`TTSStoppedFrame` could be pushed after the normal one at context completion.
(PR [#4172](https://github.com/pipecat-ai/pipecat/pull/4172))
- ⚠️ Fixed `DeepgramSTTService` compatibility with deepgram-sdk 6.1.0. The SDK
now requires explicit message objects for `send_keep_alive()`,
`send_close_stream()`, and `send_finalize()`. The minimum deepgram-sdk
version is now 6.1.0.
(PR [#4174](https://github.com/pipecat-ai/pipecat/pull/4174))
- Fixed RTVI events not being delivered to clients when using WebSocket
transports. `ProtobufFrameSerializer` now sets `ignore_rtvi_messages=False`
by default.
(PR [#4176](https://github.com/pipecat-ai/pipecat/pull/4176))
- Fixed a timing issue where turn detection timer tasks (idle controller,
speech timeout, turn analyzer, and turn completion) could miss their first
tick because the newly created asyncio task was not yet scheduled when the
caller continued.
(PR [#4183](https://github.com/pipecat-ai/pipecat/pull/4183))
- Fixed `FastAPIWebsocketTransport` intermittently hanging on shutdown when the
remote side (e.g. Twilio) disconnects while audio is being sent. A race
condition between the send and receive paths could cause the
`on_client_disconnected` callback to be skipped, leaving the pipeline waiting
for a disconnect signal that never came.
(PR [#4186](https://github.com/pipecat-ai/pipecat/pull/4186))
### Performance
- `RimeTTSService` now handles Rime's `done` WebSocket message to complete
audio contexts immediately, eliminating the 3-second idle timeout that
previously added latency at the end of each utterance.
(PR [#4172](https://github.com/pipecat-ai/pipecat/pull/4172))
## [0.0.107] - 2026-03-23
### Added

View File

@@ -10,7 +10,7 @@ Pipecat is an open-source Python framework for building real-time voice and mult
```bash
# Setup development environment
uv sync --group dev --all-extras --no-extra gstreamer --no-extra krisp
uv sync --group dev --all-extras --no-extra gstreamer
# Install pre-commit hooks
uv run pre-commit install

View File

@@ -23,7 +23,7 @@ Create your integration following the patterns and examples shown in the "Integr
Your repository must contain these components:
- **Source code** - Complete implementation following Pipecat patterns
- **Foundational example** - Single file example showing basic usage (see [Pipecat examples](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational))
- **Foundational example** - Single file example showing basic usage (see [Pipecat examples](https://github.com/pipecat-ai/pipecat/tree/main/examples))
- **README.md** - Must include:
- Introduction and explanation of your integration
- Installation instructions
@@ -225,6 +225,17 @@ Vision services process images and provide analysis such as descriptions, object
### Naming Conventions
#### Package and Repository Naming
Use the `pipecat-{vendor}` naming convention for your PyPI package and repository:
- `pipecat-{vendor}` — for single-service integrations (e.g., `pipecat-deepdub`)
- `pipecat-{vendor}-{type}` — when a vendor offers multiple service types (e.g., `pipecat-upliftai-stt`, `pipecat-upliftai-tts`)
This convention makes community packages easily discoverable via PyPI search and clearly identifies them as part of the Pipecat ecosystem.
#### Class Naming
- **STT:** `VendorSTTService`
- **LLM:** `VendorLLMService`
- **TTS:**
@@ -406,8 +417,9 @@ Use Pipecat's tracing decorators:
### Packaging and Distribution
- Name your package `pipecat-{vendor}` (see [Naming Conventions](#naming-conventions))
- Use [uv](https://docs.astral.sh/uv/) for packaging (encouraged)
- Consider releasing to PyPI for easier installation
- Publish to PyPI for easier installation
- Follow semantic versioning principles
- Maintain a changelog

View File

@@ -8,7 +8,7 @@
**Pipecat** is an open-source Python framework for building real-time voice and multimodal conversational agents. Orchestrate audio and video, AI services, different transports, and conversation pipelines effortlessly—so you can focus on what makes your agent unique.
> Want to dive right in? Try the [quickstart](https://docs.pipecat.ai/getting-started/quickstart).
> Want to dive right in? Run `pipecat init quickstart` or follow the [quickstart guide](https://docs.pipecat.ai/getting-started/quickstart).
## 🚀 What You Can Build
@@ -80,25 +80,25 @@ Catch new features, interviews, and how-tos on our [Pipecat TV](https://www.yout
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/storytelling-chatbot/image.png" width="400" /></a>
<br/>
<a href="https://github.com/pipecat-ai/pipecat-examples/tree/main/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat-examples/main/translation-chatbot/image.png" width="400" /></a>&nbsp;
<a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12-describe-video.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/assets/moondream.png" width="400" /></a>
<a href="https://github.com/pipecat-ai/pipecat/blob/main/examples/vision/vision-moondream.py"><img src="https://github.com/pipecat-ai/pipecat/blob/main/examples/assets/moondream.png" width="400" /></a>
</p>
## 🧩 Available services
| Category | Services |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [Novita](https://docs.pipecat.ai/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/server/services/tts/smallest), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox), |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), Local |
| Serializers | [Exotel](https://docs.pipecat.ai/server/utilities/serializers/exotel), [Plivo](https://docs.pipecat.ai/server/utilities/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/utilities/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/utilities/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/utilities/serializers/vonage) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [LemonSlice](https://docs.pipecat.ai/server/services/video/lemonslice), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/google-imagen), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp](https://docs.pipecat.ai/server/utilities/audio/krisp-filter), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
| Community | [Browse community integrations →](https://docs.pipecat.ai/server/services/community-integrations) |
| Category | Services |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Speech-to-Text | [AssemblyAI](https://docs.pipecat.ai/server/services/stt/assemblyai), [AWS](https://docs.pipecat.ai/server/services/stt/aws), [Azure](https://docs.pipecat.ai/server/services/stt/azure), [Cartesia](https://docs.pipecat.ai/server/services/stt/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/stt/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/stt/elevenlabs), [Fal Wizper](https://docs.pipecat.ai/server/services/stt/fal), [Gladia](https://docs.pipecat.ai/server/services/stt/gladia), [Google](https://docs.pipecat.ai/server/services/stt/google), [Gradium](https://docs.pipecat.ai/server/services/stt/gradium), [Groq (Whisper)](https://docs.pipecat.ai/server/services/stt/groq), [NVIDIA Riva](https://docs.pipecat.ai/server/services/stt/riva), [OpenAI (Whisper)](https://docs.pipecat.ai/server/services/stt/openai), [Sarvam](https://docs.pipecat.ai/server/services/stt/sarvam), [Soniox](https://docs.pipecat.ai/server/services/stt/soniox), [Speechmatics](https://docs.pipecat.ai/server/services/stt/speechmatics), [Whisper](https://docs.pipecat.ai/server/services/stt/whisper) |
| LLMs | [Anthropic](https://docs.pipecat.ai/server/services/llm/anthropic), [AWS](https://docs.pipecat.ai/server/services/llm/aws), [Azure](https://docs.pipecat.ai/server/services/llm/azure), [Cerebras](https://docs.pipecat.ai/server/services/llm/cerebras), [DeepSeek](https://docs.pipecat.ai/server/services/llm/deepseek), [Fireworks AI](https://docs.pipecat.ai/server/services/llm/fireworks), [Gemini](https://docs.pipecat.ai/server/services/llm/gemini), [Grok](https://docs.pipecat.ai/server/services/llm/grok), [Groq](https://docs.pipecat.ai/server/services/llm/groq), [Mistral](https://docs.pipecat.ai/server/services/llm/mistral), [Nebius](https://docs.pipecat.ai/server/services/llm/nebius), [Novita](https://docs.pipecat.ai/server/services/llm/novita), [NVIDIA NIM](https://docs.pipecat.ai/server/services/llm/nvidia), [Ollama](https://docs.pipecat.ai/server/services/llm/ollama), [OpenAI](https://docs.pipecat.ai/server/services/llm/openai), [OpenRouter](https://docs.pipecat.ai/server/services/llm/openrouter), [Perplexity](https://docs.pipecat.ai/server/services/llm/perplexity), [Qwen](https://docs.pipecat.ai/server/services/llm/qwen), [SambaNova](https://docs.pipecat.ai/server/services/llm/sambanova), [Sarvam](https://docs.pipecat.ai/server/services/llm/sarvam), [Together AI](https://docs.pipecat.ai/server/services/llm/together) |
| Text-to-Speech | [Async](https://docs.pipecat.ai/server/services/tts/asyncai), [AWS](https://docs.pipecat.ai/server/services/tts/aws), [Azure](https://docs.pipecat.ai/server/services/tts/azure), [Camb AI](https://docs.pipecat.ai/server/services/tts/camb), [Cartesia](https://docs.pipecat.ai/server/services/tts/cartesia), [Deepgram](https://docs.pipecat.ai/server/services/tts/deepgram), [ElevenLabs](https://docs.pipecat.ai/server/services/tts/elevenlabs), [Fish](https://docs.pipecat.ai/server/services/tts/fish), [Google](https://docs.pipecat.ai/server/services/tts/google), [Gradium](https://docs.pipecat.ai/server/services/tts/gradium), [Groq](https://docs.pipecat.ai/server/services/tts/groq), [Hume](https://docs.pipecat.ai/server/services/tts/hume), [Inworld](https://docs.pipecat.ai/server/services/tts/inworld), [Kokoro](https://docs.pipecat.ai/server/services/tts/kokoro), [LMNT](https://docs.pipecat.ai/server/services/tts/lmnt), [MiniMax](https://docs.pipecat.ai/server/services/tts/minimax), [Neuphonic](https://docs.pipecat.ai/server/services/tts/neuphonic), [NVIDIA Riva](https://docs.pipecat.ai/server/services/tts/riva), [OpenAI](https://docs.pipecat.ai/server/services/tts/openai), [Piper](https://docs.pipecat.ai/server/services/tts/piper), [Resemble](https://docs.pipecat.ai/server/services/tts/resemble), [Rime](https://docs.pipecat.ai/server/services/tts/rime), [Sarvam](https://docs.pipecat.ai/server/services/tts/sarvam), [Smallest](https://docs.pipecat.ai/server/services/tts/smallest), [Speechmatics](https://docs.pipecat.ai/server/services/tts/speechmatics), [xAI](https://docs.pipecat.ai/server/services/tts/xai), [XTTS](https://docs.pipecat.ai/server/services/tts/xtts) |
| Speech-to-Speech | [AWS Nova Sonic](https://docs.pipecat.ai/server/services/s2s/aws), [Gemini Multimodal Live](https://docs.pipecat.ai/server/services/s2s/gemini), [Grok Voice Agent](https://docs.pipecat.ai/server/services/s2s/grok), [OpenAI Realtime](https://docs.pipecat.ai/server/services/s2s/openai), [Ultravox](https://docs.pipecat.ai/server/services/s2s/ultravox), |
| Transport | [Daily (WebRTC)](https://docs.pipecat.ai/server/services/transport/daily), [FastAPI Websocket](https://docs.pipecat.ai/server/services/transport/fastapi-websocket), [LiveKit (WebRTC)](https://docs.pipecat.ai/server/services/transport/livekit), [SmallWebRTCTransport](https://docs.pipecat.ai/server/services/transport/small-webrtc), [WebSocket Server](https://docs.pipecat.ai/server/services/transport/websocket-server), [WhatsApp](https://docs.pipecat.ai/server/services/transport/whatsapp), Local |
| Serializers | [Exotel](https://docs.pipecat.ai/server/services/serializers/exotel), [Genesys](https://docs.pipecat.ai/server/services/serializers/genesys), [Plivo](https://docs.pipecat.ai/server/services/serializers/plivo), [Twilio](https://docs.pipecat.ai/server/services/serializers/twilio), [Telnyx](https://docs.pipecat.ai/server/services/serializers/telnyx), [Vonage](https://docs.pipecat.ai/server/services/serializers/vonage) |
| Video | [HeyGen](https://docs.pipecat.ai/server/services/video/heygen), [LemonSlice](https://docs.pipecat.ai/server/services/transport/lemonslice), [Tavus](https://docs.pipecat.ai/server/services/video/tavus), [Simli](https://docs.pipecat.ai/server/services/video/simli) |
| Memory | [mem0](https://docs.pipecat.ai/server/services/memory/mem0) |
| Vision & Image | [fal](https://docs.pipecat.ai/server/services/image-generation/fal), [Google Imagen](https://docs.pipecat.ai/server/services/image-generation/google-imagen), [Moondream](https://docs.pipecat.ai/server/services/vision/moondream) |
| Audio Processing | [Silero VAD](https://docs.pipecat.ai/server/utilities/audio/silero-vad-analyzer), [Krisp Viva](https://docs.pipecat.ai/guides/features/krisp-viva), [Koala](https://docs.pipecat.ai/server/utilities/audio/koala-filter), [ai-coustics](https://docs.pipecat.ai/server/utilities/audio/aic-filter), [RNNoise](https://docs.pipecat.ai/server/utilities/audio/rnnoise-filter) |
| Analytics & Metrics | [OpenTelemetry](https://docs.pipecat.ai/server/utilities/opentelemetry), [Sentry](https://docs.pipecat.ai/server/services/analytics/sentry) |
| Community | [Browse community integrations →](https://docs.pipecat.ai/server/services/community-integrations) |
📚 [View full services documentation →](https://docs.pipecat.ai/server/services/supported-services)
@@ -142,7 +142,7 @@ You can get started with Pipecat running on your local machine, then move your a
## 🧪 Code examples
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [Foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples) — small snippets that build on each other, introducing one or two concepts at a time
- [Example apps](https://github.com/pipecat-ai/pipecat-examples) — complete applications that you can use as starting points for development
## 🛠️ Contributing to the framework
@@ -166,7 +166,6 @@ You can get started with Pipecat running on your local machine, then move your a
```bash
uv sync --group dev --all-extras \
--no-extra gstreamer \
--no-extra krisp \
--no-extra local \
```

View File

@@ -1 +0,0 @@
- Added `SarvamLLMService` with support for `sarvam-30b`, `sarvam-30b-16k`, `sarvam-105b` and `sarvam-105b-32k`

View File

@@ -1 +0,0 @@
- Added `on_turn_context_created(context_id)` hook to `TTSService`. Override this to perform provider-specific setup (e.g. eagerly opening a server-side context) before text starts flowing. Called each time a new turn context ID is created.

View File

@@ -1 +0,0 @@
- Added context prewarming path for `InworldTTSService` to improve first audio latency

View File

@@ -1 +0,0 @@
- Added `KrispVivaVadAnalyzer` for Voice Activity Detection using the Krisp VIVA SDK (requires `krisp_audio`).

View File

@@ -1 +0,0 @@
- Modeified `InworldTTSService` to close context at end of turn instead of relying on idle timeout

View File

@@ -1 +0,0 @@
- Added `XAIHttpTTSService` for text-to-speech using xAI's HTTP TTS API.

View File

@@ -1 +0,0 @@
- Added Gemini 3 support to the Gemini Live service.

View File

@@ -1 +0,0 @@
- `TTSService`: the default `stop_frame_timeout_s` (idle time before an automatic `TTSStoppedFrame` is pushed when `push_stop_frames=True`) has changed from `2.0` to `3.0` seconds.

View File

@@ -1 +0,0 @@
- Added support for "developer" role messages in conversation context across all LLM adapters. For non-OpenAI services (Anthropic, Google, AWS Bedrock), "developer" messages are converted to "user" messages (use `system_instruction` to set the system instruction). For OpenAI services, "developer" messages pass through in conversation history. For the Responses API, they are kept as "developer" role (matching the existing "system" → "developer" conversion).

View File

@@ -1 +0,0 @@
- ⚠️ `GeminiLLMAdapter` now only treats `messages[0]` as the initial system message, matching all other adapters. Previously it searched for the first "system" message anywhere in the conversation history. A "system" message appearing later in the list will now be converted to "user" instead of being extracted as the system instruction.

View File

@@ -1 +0,0 @@
- Fixed Gemini Live (`GoogleGeminiLiveLLMService`) not honoring `settings.system_instruction`. The system instruction was being read from a deprecated constructor parameter instead of the settings object, causing it to be silently ignored.

View File

@@ -1 +0,0 @@
- Fixed `AWSBedrockLLMAdapter` sending an empty message list to the API when the only message in context was a system message. The lone system message is now converted to "user" role instead of being extracted, matching the existing Anthropic adapter behavior.

View File

@@ -1 +0,0 @@
- Added `SmallestTTSService`, a WebSocket-based TTS service integration with Smallest AI's Waves API. Supports the Lightning v2 and v3.1 models with configurable voice, language, speed, consistency, similarity, and enhancement settings.

View File

@@ -1 +0,0 @@
- Fixed `InworldTtsService` to fallback to full text when TTS timestamps are not received

View File

@@ -1 +0,0 @@
- Added warnings in turn stop strategies when `VADParams.stop_secs` differs from the recommended default (0.2s) or when `stop_secs >= STT p99 latency`, which collapses the STT wait timeout to 0s and may cause delayed turn detection. The warnings guide developers to re-run the [stt-benchmark](https://github.com/pipecat-ai/stt-benchmark) with their VAD settings.

View File

@@ -1 +0,0 @@
- Added `domain` parameter to `AssemblyAISTTSettings` for specialized recognition modes such as Medical Mode (`domain="medical-v1"`).

View File

@@ -1 +0,0 @@
- Added `NovitaLLMService` for using Novita AI's LLM models via their OpenAI-compatible API.

View File

@@ -1 +0,0 @@
- Added `cleanup()` method to `VADAnalyzer` and `VADController` so VAD analyzer resources are properly released when no longer needed. Custom `VADAnalyzer` subclasses can override `cleanup()` to free any held resources.

View File

@@ -1 +0,0 @@
- Fixed Gemini Live pipeline hanging indefinitely when an `EndFrame` was deferred while waiting for the bot to finish responding and `turn_complete` never arrived. As a possible root-cause fix, `turn_complete` messages are now handled even if they lack `usage_metadata`. As a fallback, the deferred `EndFrame` now has a 30-second safety timeout.

View File

@@ -1 +0,0 @@
- Fixed ElevenLabs WebSocket disconnections (1008 "Maximum simultaneous contexts exceeded") caused by rapid user interruptions. When interruptions arrived before any TTS text was generated, phantom contexts were created on the ElevenLabs server that were never closed, eventually exceeding the 5-context limit.

View File

@@ -1 +0,0 @@
- Fixed the final sentence being dropped from the conversation context when using RTVI text input with non-word-timestamp TTS services. The `LLMFullResponseEndFrame` was racing ahead of the last `TTSTextFrame`, causing the `LLMAssistantAggregator` to finalize the context before the final sentence arrived.

View File

@@ -1 +0,0 @@
- Added `on_end_of_turn` event handler to `AssemblyAISTTService`. This fires after the final transcript is pushed, providing a reliable hook for end-of-turn logic that doesn't race with `TranscriptionFrame`. Works in both Pipecat and AssemblyAI turn detection modes.

View File

@@ -1 +0,0 @@
- ⚠️ Realtime services (Gemini Live, OpenAI Realtime, Grok Realtime, Nova Sonic) now prefer `system_instruction` from service settings over an initial system message in the LLM context, matching the behavior of non-realtime services. Previously, context-provided system instructions took precedence. A warning is now logged when both are set.

View File

@@ -1 +0,0 @@
- Fixed audio crackling and popping in recordings when both user and bot are speaking. `AudioBufferProcessor` no longer injects silence into a track's buffer while that track is actively producing audio, preventing mid-utterance interruptions in the recorded output.

View File

@@ -1 +0,0 @@
- Bumped `nvidia-riva-client` minimum version to `>=2.25.1`.

View File

@@ -1 +0,0 @@
- Upgraded `protobuf` from 5.x to 6.x (`>=6.31.1,<7`).

View File

@@ -1 +0,0 @@
- Unrecognized language strings (e.g. Deepgram's `"multi"`) no longer produce a warning at startup. The log message has been downgraded to debug level since these are valid service-specific values that are passed through correctly.

1
changelog/4141.added.md Normal file
View File

@@ -0,0 +1 @@
- ⚠️ Added WebSocket-based `OpenAIResponsesLLMService` as the new default for the OpenAI Responses API. It maintains a persistent connection to `wss://api.openai.com/v1/responses` and automatically uses `previous_response_id` to send only incremental context, falling back to full context on reconnection or cache miss. The previous HTTP-based implementation is now available as `OpenAIResponsesHttpLLMService`.

View File

@@ -1 +0,0 @@
- `GrokLLMService` and `GrokRealtimeLLMService` now live in the `pipecat.services.xai` module alongside `XAIHttpTTSService`, since all three use the same xAI API. Update imports from `pipecat.services.grok.*` to `pipecat.services.xai.*` (e.g. `from pipecat.services.xai.llm import GrokLLMService`).

View File

@@ -1 +0,0 @@
- `pipecat.services.grok.llm`, `pipecat.services.grok.realtime.llm`, and `pipecat.services.grok.realtime.events` are deprecated. The old import paths still work but emit a `DeprecationWarning`; use `pipecat.services.xai.llm`, `pipecat.services.xai.realtime.llm`, and `pipecat.services.xai.realtime.events` instead.

View File

@@ -1 +0,0 @@
- Added `DeepgramFluxSageMakerSTTService` for running Deepgram Flux speech-to-text on AWS SageMaker endpoints. Use with `ExternalUserTurnStrategies` to take advantage of Flux's turn detection.

View File

@@ -1 +0,0 @@
- Fixed websocket TTS word timestamps so interrupted contexts cannot leak stale words or backward PTS values into later turns.

View File

@@ -1 +0,0 @@
- Fixed a race condition in `InterruptibleTTSService` where, if `run_tts` had been invoked but `BotStartedSpeakingFrame` had not yet been received, a user interruption could allow stale audio to leak through.

View File

@@ -1 +0,0 @@
- ⚠️ `TTSService.add_word_timestamps()` no longer supports the `"Reset"` and `"TTSStoppedFrame"` sentinel strings. If you have a custom TTS service that called `await self.add_word_timestamps([("Reset", 0)])` or `await self.add_word_timestamps([("TTSStoppedFrame", 0), ("Reset", 0)], ctx_id)`, replace them with `await self.append_to_audio_context(ctx_id, TTSStoppedFrame(context_id=ctx_id))` and let `_handle_audio_context` manage the word-timestamp reset automatically.

View File

@@ -1 +0,0 @@
- Fixed Gemini Live local VAD mode (`GeminiVADParams(disabled=True)` with external VAD) not working. The bot now correctly detects user speech and signals turn boundaries to the Gemini API.

View File

@@ -1 +0,0 @@
- Fixed Gemini Live message handling to process all `server_content` fields independently. Gemini 3.x can bundle multiple fields (e.g. `model_turn` and `output_transcription`) on the same message, but the previous `elif` chain only processed the first match, silently dropping the rest.

View File

@@ -1 +0,0 @@
- Fixed `ServiceSwitcher` with `ServiceSwitcherStrategyFailover` incorrectly triggering failover when `ErrorFrame`s from other pipeline stages (e.g. TTS) propagated upstream through the switcher. Previously, any non-fatal error passing through would be misattributed to the active service and trigger an unwanted service switch. Now only errors originating from the switcher's own managed services trigger failover.

View File

@@ -1 +0,0 @@
- Fixed `LiveKitOutputTransport` not clearing the `rtc.AudioSource` internal buffer on interruption, causing the bot to continue speaking for several seconds after being interrupted.

View File

@@ -1 +0,0 @@
- Fixed a crash in OpenAI LLM processing when the provider returns `chunk.choices[0].delta.audio = None`, which caused `'NoneType' object has no attribute 'get'` errors during audio transcript handling.

View File

@@ -1 +0,0 @@
- Fixed error floods in `DeepgramSTTService` when the WebSocket connection drops. With Deepgram SDK 6.x, `send_media()` raises exceptions on a dead connection instead of silently failing, causing every queued audio frame to log an error. Now `send_media()` failures are caught gracefully — a single warning is logged and audio frames are skipped until the existing reconnection logic restores the connection.

View File

@@ -1 +0,0 @@
- Removed `SambaNovaSTTService`. SambaNova no longer offers speech-to-text audio models. Use another STT provider instead.

View File

@@ -1 +0,0 @@
- Added `Mem0MemoryService.get_memories()` convenience method for retrieving all stored memories outside the pipeline (e.g. to build a personalized greeting at connection time). This avoids the need to manually handle client type branching, filter construction, and async wrapping.

View File

@@ -1 +0,0 @@
- ⚠️ Bumped `mem0ai` dependency from `~=0.1.94` to `>=1.0.8,<2`. Users of the `mem0` extra will need to update their mem0ai package.

View File

@@ -1 +0,0 @@
- Fixed `Mem0MemoryService` failing to store messages when the context contained system or developer role messages. The Mem0 API only accepts user and assistant roles, so other roles are now filtered out before storing.

View File

@@ -1 +0,0 @@
- `Mem0MemoryService` no longer blocks the event loop during memory storage and retrieval. All Mem0 API calls now run in a background thread, and message storage is fire-and-forget so it doesn't delay downstream processing.

View File

@@ -1 +0,0 @@
- Added missing `on_dtmf_event` callback to `LemonSliceTransportClient.setup()` `DailyCallbacks` construction, fixing a `ValidationError` at pipeline setup time.

View File

@@ -1 +0,0 @@
- Fixed an issue in `InworldTTSService` where, in cases of fast interruption, we would continue receiving audio from the previous context.

View File

@@ -1 +0,0 @@
- Fixed a word timestamp interleaving issue in `InworldTTSService` when processing multiple sentences.

View File

@@ -1 +0,0 @@
- Fixed duplicate `TTSStoppedFrame` being pushed in TTS services using `push_stop_frames=True`. When the stop-frame timeout fired, a second `TTSStoppedFrame` could be pushed after the normal one at context completion.

View File

@@ -1 +0,0 @@
- `RimeTTSService` now handles Rime's `done` WebSocket message to complete audio contexts immediately, eliminating the 3-second idle timeout that previously added latency at the end of each utterance.

View File

@@ -1 +0,0 @@
- ⚠️ Fixed `DeepgramSTTService` compatibility with deepgram-sdk 6.1.0. The SDK now requires explicit message objects for `send_keep_alive()`, `send_close_stream()`, and `send_finalize()`. The minimum deepgram-sdk version is now 6.1.0.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `OpenPipeLLMService` and the `openpipe` extra. OpenPipe was acquired by CoreWeave and the package is no longer maintained. If you were using `openpipe` as an LLM provider, switch to the underlying provider directly (e.g. `openai`). The OpenPipe interface can still be used with `OpenAILLMService` by specifying a `base_url`.

View File

@@ -0,0 +1 @@
- ⚠️ Updated `langchain` extra to require langchain 1.x (from 0.3.x), langchain-community 0.4.x (from 0.3.x), and langchain-openai 1.x (from 0.3.x). If you pin these packages in your project, update your pins accordingly.

1
changelog/4202.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed `InworldHttpTTSService` streaming responses crashing with `UnicodeDecodeError` when multi-byte UTF-8 characters were split across chunk boundaries. This caused TTS audio to cut off mid-sentence intermittently.

1
changelog/4203.fixed.md Normal file
View File

@@ -0,0 +1 @@
- Fixed a crash (`JSONDecodeError`) when a user interruption occurs while the LLM is streaming function call arguments. Previously, the incomplete JSON arguments were passed directly to `json.loads()`, causing an unhandled exception. Affected services: OpenAI, Google (OpenAI-compatible), and SambaNova.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `observers` field from `PipelineParams`. Pass observers directly to `PipelineTask` constructor instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `on_pipeline_ended`, `on_pipeline_cancelled`, and `on_pipeline_stopped` events from `PipelineTask`. Use `on_pipeline_finished` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `AudioBufferProcessor.user_continuous_stream` parameter. Use `user_audio_passthrough` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `camera_in_enabled`, `camera_in_is_live`, `camera_in_width`, `camera_in_height`, `camera_out_enabled`, `camera_out_is_live`, `camera_out_width`, `camera_out_height`, and `camera_out_color` transport params. Use the `video_in_*` and `video_out_*` equivalents instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `RTVIObserver.errors_enabled` parameter.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `vad_enabled` and `vad_audio_passthrough` transport params.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `TTSService.say()`. Push a `TTSSpeakFrame` into the pipeline instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `DailyRunner.configure_with_args()`. Use `PipelineRunner` with `RunnerArguments` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated RTVI models, frames, and processor methods including `RTVIConfig`, `RTVIServiceConfig`, `RTVIServiceOptionConfig`, various `RTVI*Data` models, `RTVIActionFrame`, and `RTVIProcessor.handle_function_call`/`handle_function_call_start`. Use the updated RTVI processor API instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `FrameProcessor.wait_for_task()`. Use `create_task()` and manage tasks with the built-in `TaskManager` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `KrispFilter`. The `krisp` extra has been removed from `pyproject.toml`.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `LLMService.request_image_frame()`. Push a `UserImageRequestFrame` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `create_default_resampler()` from `pipecat.audio.utils`.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `FalSmartTurnAnalyzer` and `LocalSmartTurnAnalyzer`.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated transport frames: `TransportMessageFrame`, `TransportMessageUrgentFrame`, `InputTransportMessageUrgentFrame`, `DailyTransportMessageFrame`, and `DailyTransportMessageUrgentFrame`. Use `OutputTransportMessageFrame`, `OutputTransportMessageUrgentFrame`, `InputTransportMessageFrame`, `DailyOutputTransportMessageFrame`, and `DailyOutputTransportMessageUrgentFrame` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `KeypadEntryFrame` alias.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated interruption frames: `StartInterruptionFrame` and `BotInterruptionFrame`. Use `InterruptionFrame` and `InterruptionTaskFrame` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `LLMService.start_callback` parameter. Register an `on_llm_response_start` event handler instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed single-argument function call support from `LLMService`. Functions must use named parameters instead of a single `arguments` parameter.

View File

@@ -0,0 +1 @@
- ⚠️ Removed `NoisereduceFilter`. Use system-level noise reduction or a service-based alternative instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.riva` package. Use `pipecat.services.nvidia.stt` and `pipecat.services.nvidia.tts` instead (`RivaSTTService``NvidiaSTTService`, `RivaTTSService``NvidiaTTSService`).

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.nim` package. Use `pipecat.services.nvidia.llm` instead (`NimLLMService``NvidiaLLMService`).

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.gemini_multimodal_live` package. Use `pipecat.services.google.gemini_live` instead. Note that class names no longer include "Multimodal" (e.g. `GeminiMultimodalLiveLLMService``GeminiLiveLLMService`).

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.aws_nova_sonic` package. Use `pipecat.services.aws.nova_sonic` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.openai_realtime` package. Use `pipecat.services.openai.realtime` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `OpenAIRealtimeBetaLLMService` and `AzureRealtimeBetaLLMService`. Use `OpenAIRealtimeLLMService` and `AzureRealtimeLLMService` from `pipecat.services.openai.realtime` and `pipecat.services.azure.realtime` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.deepgram.stt_sagemaker` and `pipecat.services.deepgram.tts_sagemaker` modules. Use `pipecat.services.deepgram.sagemaker.stt` and `pipecat.services.deepgram.sagemaker.tts` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `GoogleLLMOpenAIBetaService` from `pipecat.services.google.openai`. Use `GoogleLLMService` from `pipecat.services.google.llm` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.google.llm_vertex` module. Use `pipecat.services.google.vertex.llm` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.google.gemini_live.llm_vertex` module. Use `pipecat.services.google.gemini_live.vertex.llm` instead.

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated `pipecat.services.ai_services` module. Import from `pipecat.services.ai_service`, `pipecat.services.llm_service`, `pipecat.services.stt_service`, `pipecat.services.tts_service`, etc. instead.

View File

@@ -0,0 +1 @@
- Changed `GrokLLMService` default model from `grok-3-beta` to `grok-3`, now that the model is generally available.

View File

@@ -0,0 +1 @@
- `GoogleImageGenService` now defaults to `imagen-4.0-generate-001` (previously `imagen-3.0-generate-002`).

View File

@@ -0,0 +1 @@
- ⚠️ `BaseOpenAILLMService.get_chat_completions()` now accepts an `LLMContext` instead of `OpenAILLMInvocationParams`. If you override this method, update your signature accordingly.

View File

@@ -0,0 +1,22 @@
- ⚠️ Removed deprecated service-specific context and aggregator machinery, which was superseded by the universal `LLMContext` system.
Service-specific classes removed: `AnthropicLLMContext`, `AnthropicContextAggregatorPair`, `AWSBedrockLLMContext`, `AWSBedrockContextAggregatorPair`, `OpenAIContextAggregatorPair`, and their user/assistant aggregators. Also removed `create_context_aggregator()` from `LLMService`, `OpenAILLMService`, `AnthropicLLMService`, and `AWSBedrockLLMService`.
Base aggregator classes removed (from `pipecat.processors.aggregators.llm_response`): `BaseLLMResponseAggregator`, `LLMContextResponseAggregator`, `LLMUserContextAggregator`, `LLMAssistantContextAggregator`, `LLMUserResponseAggregator`, `LLMAssistantResponseAggregator`.
From the developer's point of view, migrating will usually be a matter of going from this:
```python
context = OpenAILLMContext(messages, tools)
context_aggregator = llm.create_context_aggregator(context)
```
To this:
```python
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import LLMContextAggregatorPair
context = LLMContext(messages, tools)
context_aggregator = LLMContextAggregatorPair(context)
```

View File

@@ -0,0 +1 @@
- ⚠️ Removed deprecated frame types `LLMMessagesFrame` and `OpenAILLMContextAssistantTimestampFrame` from `pipecat.frames.frames`. Instead of `LLMMessagesFrame`, use `LLMContextFrame` with the new messages, or `LLMMessagesUpdateFrame` with `run_llm=True`.

Some files were not shown because too many files have changed in this diff Show More