pipecat

Author	SHA1	Message	Date
James Hush	8bbfa829d3	Remove wait	2025-11-26 12:27:02 +01:00
James Hush	c2eb663bdc	Add TurnAwareTranscriptProcessor for turn-based transcript tracking - Implements TurnAwareTranscriptProcessor that combines user and assistant transcript tracking with turn boundary detection - Correctly handles interruptions by capturing only what was actually spoken - Emits on_turn_started and on_turn_ended events with accumulated transcripts - Handles async frame processing with strategic delays to ensure proper text accumulation - Adds comprehensive tests covering basic flow, interruptions, and multiple turns - Includes documentation and usage examples	2025-11-26 12:26:25 +01:00
James Hush	bf055843e6	Fix race condition in DeepgramFluxSTTService reconnection Moved _receive_task and _watchdog_task creation from _connect_websocket() to _connect() to prevent multiple coroutines from attempting to receive from the websocket simultaneously during reconnection. Previously, when reconnection occurred, _connect_websocket() would be called while the existing _receive_task was still running, causing both to try to receive from the websocket. This resulted in the error: 'cannot call recv while another coroutine is already running recv or recv_streaming'. Now tasks are created only once during initial connection, and reconnection only re-establishes the websocket connection itself. This matches the pattern used by other websocket services in the codebase. Fixes issue reported in 0.0.95 where reconnection attempts would fail with recv errors.	2025-11-26 10:11:19 +01:00
Mark Backman	2607699664	Merge pull request #3125 from pipecat-ai/mb/fix-sagemaker-imports fix: remove stt_sagemaker import from deepgram/__init__.py	2025-11-24 21:31:31 -05:00
Mark Backman	47fa3b8556	Merge pull request #3108 from fbarril/livekit-transport-helper add livekit helper	2025-11-24 20:13:13 -05:00
Mark Backman	fa0100c38b	fix: remove stt_sagemaker import from deepgram/__init__.py	2025-11-24 20:04:18 -05:00
kompfner	e5142c1210	Merge pull request #3113 from pipecat-ai/pk/agentcore-processor Initial implementation of `AWSBedrockAgentCoreProcessor`	2025-11-24 19:10:44 -05:00
Paul Kompfner	5907b51c7d	In `AWSBedrockAgentCoreProcessor` use `self.create_task()`/`self.cancel_task()` instead of using `asyncio` directly.	2025-11-24 18:53:39 -05:00
Paul Kompfner	9e4ec4f7f3	Implement `AWSBedrockAgentCoreProcessor`	2025-11-24 18:53:35 -05:00
fbarril	e2161ea63d	add pyjwt as a livekit dependency	2025-11-24 23:30:11 +00:00
fbarril	7c81f66241	Merge remote-tracking branch 'origin/main' into livekit-transport-helper # Conflicts: # CHANGELOG.md # uv.lock	2025-11-24 23:29:22 +00:00
fbarril	60da466379	add pyjwt as a livekit dependency	2025-11-24 23:27:32 +00:00
fbarril	12c29b71f3	add entry to CHANGELOG.md	2025-11-24 23:27:13 +00:00
Mark Backman	b52b108932	Merge pull request #3118 from pipecat-ai/mb/deepgram-stt-sagemaker Add SageMaker BiDi client and DeepgramSageMakerSTTService	2025-11-24 16:47:25 -05:00
Mark Backman	a357ff0205	Alphabetize the project.optional-dependencies	2025-11-24 16:43:44 -05:00
Mark Backman	0ece8b5894	Add 07c Deepgram SageMaker example	2025-11-24 16:41:01 -05:00
Mark Backman	782b257bbb	Add DeepgramSageMakerSTTService	2025-11-24 16:41:01 -05:00
Mark Backman	ab8dcd6ede	Add SageMaker BiDi client	2025-11-24 16:41:00 -05:00
Mark Backman	012c2f7dde	Merge pull request #3106 from pipecat-ai/mb/update-11labs-realtime-stt Fix sample_rate issue in ElevenLabsRealtimeSTTService, add timestamps…	2025-11-24 08:10:30 -05:00
Mark Backman	87fdd8f006	Fix MiniMax changelog entries	2025-11-24 08:07:20 -05:00
Mark Backman	7bdac02837	Fix sample_rate issue in ElevenLabsRealtimeSTTService, add timestamps and logging	2025-11-24 08:06:33 -05:00
Mark Backman	861567bc59	Merge pull request #3119 from pipecat-ai/aleix/changelog-formatting format CHANGELOG	2025-11-24 08:05:11 -05:00
Aleix Conchillo Flaqué	d0ff43134a	format CHANGELOG	2025-11-23 17:48:57 -08:00
Dante Noguez	3458b74fc9	Fix 11labs realtime dynamic updates (#3117)	2025-11-22 10:02:37 -05:00
mattie ruth backman	a6202c4d1a	Fixed CHANGELOG post rebase	2025-11-21 17:16:10 -05:00
mattie ruth backman	3c3141796a	Overlooked Changelog updates	2025-11-21 17:16:10 -05:00
mattie ruth backman	8b8b57b09c	Introduced new bot-output RTVI event to provide... a best effort version of the bot's output - The `RTVIObserver` now emits `bot-output` messages based off the new `AggregatedTextFrame`s (`bot-tts-text` and `bot-llm-text` are still supported and generated, but `bot-transcript` is now deprecated in lieu of this new, more thorough, message). - The new `RTVIBotOutputMessage` includes the fields: - `spoken`: A boolean indicating whether the text was spoken by TTS - `aggregated_by`: A string representing how the text was aggregated ("sentence", "word", "my custom aggregation") - Introduced new fields to `RTVIObserver` to support the new `bot-output` messaging: - `bot_output_enabled`: Defaults to True. Set to false to disable bot-output messages. - `skip_aggregator_types`: Defaults to `None`. Set to a list of strings that match aggregation types that should not be included in bot-output messages. (Ex. `credit_card`)	2025-11-21 17:16:10 -05:00
mattie ruth backman	4f30a48ecd	Rime and Cartesia TTS Updates: `CartesiaTTSService`: - Modified use of custom default text_aggregator to avoid deprecation warnings and push users towards use of transformers or the `LLMTextProcessor` - Added convenience methods for taking advantage of Cartesia's SSML tags: spell, emotion, pauses, volume, and speed. `RimeTTSService`: - Modified use of custom default text_aggregator to avoid deprecation warnings and push users towards use of transformers or the `LLMTextProcessor` - Added convenience methods for taking advantage of Rime's customization options: spell, pauses, pronunciations, and inline speed control.	2025-11-21 17:16:10 -05:00
mattie ruth backman	ecbc41045c	Added ability to transform text just-in-time before it gets sent to the TTS	2025-11-21 17:16:10 -05:00
mattie ruth backman	e1528d0f0c	Added support to TTS services to skip sending text to the... the actual TTS service to be spoken based on its aggregation type.	2025-11-21 17:16:10 -05:00
mattie ruth backman	6b6d760cf1	Introduced LLMTextProcessor and deprecated custom text_aggregators in TTS Introduced `LLMTextProcessor`: A new processor meant to allow customization for how LLMTextFrames should be aggregated and considered. Its purpose is to turn `LLMTextFrame`s into `AggregatedTextFrame`s. By default, a TTSService will still aggregate `LLMTextFrame`s by sentence for the service to consume. However, if you wish to override how the LLM text is aggregated, you should no longer override the TTS's internal text_aggregator, but instead, insert this processor between your LLM and TTS in the pipeline.	2025-11-21 17:16:10 -05:00
mattie ruth backman	7a4372a909	Introduced a new AggregatedTextFrame Frame type that TTSTextFrame inherits from This frame introduces an `aggregated_by` field to describe the type of text included in the frame and allows unspoken groupings of text to be pushed through the pipeline and treated similar to TTSTextFrames.	2025-11-21 17:16:10 -05:00
mattie ruth backman	0e820a01b9	Introduce `append_to_context` to `TextFrame`s Adding support for setting whether or not the text in the TextFrame should be added to the LLM context (by the LLM assistant aggregator). Defaults to `True`.	2025-11-21 17:16:10 -05:00
mattie ruth backman	24266c238f	Augmented PatternPairAggregator so that matched patterns can... be treated as their own aggregation, taking advantage of the new ability to assign a type to an aggregation	2025-11-21 17:16:10 -05:00
mattie ruth backman	dcc20f86e1	Updated the BaseTextAggregator to categorize aggregations Modified the BaseTextAggregator type so that when text gets aggregated, metadata can be associated with it. Currently, that just means a `type`, so that the aggregation can be classified or described. Changes made to support this: - IMPORTANT: Aggregators are now expected to strip leading/trailing white space characters before returning their aggregation from `aggregation()` or `.text`. This way all aggregators have a consistent contract allowing downstream use to know how to stitch aggregations back together - Introduced a new `Aggregation` dataclass to represent both the aggregated `text` and a string identifying the `type` of aggregation (ex. "sentence", "word", "my custom aggregation") - BREAKING: `BaseTextAggregator.text` now returns an `Aggregation` (instead of `str`). To update: `aggregated_text = myAggregator.text` -> `aggregated_text = myAggregator.text.text` - BREAKING: `BaseTextAggregator.aggregate()` now returns `Optional[Aggregation]` (instead of `Optional[str]`). To update: ``` aggregation = myAggregator.aggregate(text) if (aggregation): print(f"successfully aggregated text: {aggregation.text}") // instead of {aggregation} ``` - `SimpleTextAggregator`, `SkipTagsAggregator`, `PatternPairAggregator` updated to produce/consume `Aggregation` objects. - All uses of the above Aggregators have been updated accordingly.	2025-11-21 17:16:10 -05:00
fbarril	ec8964425a	add livekit helper	2025-11-21 00:27:57 +00:00
Vanessa Pyne	26918728df	Merge pull request #3096 from pipecat-ai/vp-minimax-2962-v2 minimax 2962 language updates	2025-11-20 10:41:35 -06:00
vipyne	954849379b	cleanup	2025-11-20 10:41:09 -06:00
vipyne	06542a2dbc	Update CHANGELOG	2025-11-20 10:41:09 -06:00
Vanessa Pyne	59d40eac45	Update src/pipecat/services/minimax/tts.py Co-authored-by: Mark Backman <mark@daily.co> add warning	2025-11-20 10:41:09 -06:00
vipyne	17cf6c56cf	minimax updates some `debug`s -> `trace`s add western US base_url to docs ensure error_message is defined add deprecation warning for `english_normalization` param	2025-11-20 10:41:09 -06:00
minimax	616e6ba351	docs(minimax): add API endpoint comment for west US region	2025-11-20 10:41:08 -06:00
minimax	f3cb5e0106	feat(minimax): comprehensive updates to TTS service - Add support for speech-2.6-hd and speech-2.6-turbo models - Add 16 new languages (total 40): Afrikaans, Bulgarian, Catalan, Danish, Persian, Filipino, Hebrew, Croatian, Hungarian, Malay, Norwegian, Nynorsk, Slovak, Slovenian, Swedish, Tamil - Add new emotions: calm and fluent - Add new parameters: text_normalization (renamed from english_normalization), latex_read, force_cbr, exclude_aggregated_audio, subtitle_enable, subtitle_type - Extract trace_id from response headers for all requests - Improve error handling for non-streaming error responses - Add detailed extra_info logging (audio_length, audio_size, usage_characters, word_count) - Add validation warnings for language/model compatibility - Fix silent error issue where HTTP 200 responses with errors were ignored BREAKING CHANGE: Renamed parameter english_normalization to text_normalization	2025-11-20 10:41:08 -06:00
Aleix Conchillo Flaqué	c89f230c99	fix CHANGELOG	2025-11-20 08:40:30 -08:00
Aleix Conchillo Flaqué	69cd5716cd	Merge pull request #3102 from pipecat-ai/aleix/daily-python-0.22.0 pyproject: update daily-python to 0.22.0	2025-11-20 08:35:39 -08:00
Mark Backman	ab58f72322	Merge pull request #3101 from hwuiwon/hw/inworld-talking-speed feat: Add speaking rate control to Inworld TTS service.	2025-11-20 09:50:55 -05:00
Hwuiwon Kim	ead361f665	fix	2025-11-20 07:45:13 -05:00
Aleix Conchillo Flaqué	fa6b8851ed	pyproject: update daily-python to 0.22.0	2025-11-19 21:56:38 -08:00
Hwuiwon Kim	1cc69d475d	feat: Add speaking rate control to Inworld TTS service & fix param cases	2025-11-19 22:57:53 -05:00
Mark Backman	51bdd8b728	Merge pull request #3097 from hwuiwon/fix-typo Fix typo in STT event handler documentation	2025-11-19 17:10:32 -05:00
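
Commit dcc20f86e1 above describes `BaseTextAggregator.text` and `aggregate()` returning an `Aggregation` (with `text` and `type`) instead of a plain `str`. The following is a minimal, self-contained sketch of that contract, not pipecat's actual code: the `Aggregation` dataclass here is a stand-in with the two fields the commit message names, and `SentenceAggregator` is a hypothetical aggregator written only to illustrate the migration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Aggregation:
    """Stand-in for the dataclass named in the commit: text plus a type label."""

    text: str
    type: str


class SentenceAggregator:
    """Hypothetical aggregator (not a pipecat class) that emits one sentence at a time."""

    def __init__(self) -> None:
        self._buffer = ""

    @property
    def text(self) -> Aggregation:
        # Per the commit message, aggregators strip leading/trailing whitespace.
        return Aggregation(text=self._buffer.strip(), type="sentence")

    def aggregate(self, text: str) -> Optional[Aggregation]:
        # Accumulate text and return an Aggregation once a sentence end is seen.
        self._buffer += text
        for i, ch in enumerate(self._buffer):
            if ch in ".!?":
                sentence = self._buffer[: i + 1].strip()
                self._buffer = self._buffer[i + 1 :]
                return Aggregation(text=sentence, type="sentence")
        return None


aggregator = SentenceAggregator()
first = aggregator.aggregate("Hello ")        # no sentence end yet
second = aggregator.aggregate("world. Next")  # completes one sentence

if second:
    # Callers now read `.text` on the result instead of using it as a str.
    print(f"aggregated ({second.type}): {second.text}")

# Pending text is also wrapped: `.text.text` rather than `.text`.
pending = aggregator.text.text
```

Code that previously treated the return value as a string would need the extra `.text` access shown in the last lines, which matches the update instructions given in the commit message.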

6425 Commits