Files
pipecat/tests/test_simple_text_aggregator.py
mattie ruth backman dcc20f86e1 Updated the BaseTextAggregator to categorize aggregations
Modified the BaseTextAggregator type so that when text gets aggregated, metadata can
be associated with it. Currently, that just means a `type`, so that the aggregation
can be classified or described. Changes made to support this:
  - **IMPORTANT**: Aggregators are now expected to strip leading/trailing white space
    characters before returning their aggregation from `aggregation()` or `.text`. This
    way all aggregators have a consistent contract allowing downstream use to know how
    to stitch aggregations back together
  - Introduced a new `Aggregation` dataclass to represent both the aggregated `text` and
    a string identifying the `type` of aggregation (ex. "sentence", "word", "my custom
    aggregation")
  - **BREAKING**: `BaseTextAggregator.text` now returns an `Aggregation` (instead of `str`).
    To update: `aggregated_text = myAggregator.text` -> `aggregated_text = myAggregator.text.text`
  - **BREAKING**: `BaseTextAggregator.aggregate()` now returns `Optional[Aggregation]`
    (instead of `Optional[str]`). To update:
      ```
      aggregation = myAggregator.aggregate(text)
      if (aggregation):
        print(f"successfully aggregated text: {aggregation.text}") // instead of {aggregation}
      ```
  - `SimpleTextAggregator`, `SkipTagsAggregator`, `PatternPairAggregator` updated to
     produce/consume `Aggregation` objects.
  - All uses of the above Aggregators have been updated accordingly.
2025-11-21 17:16:10 -05:00

36 lines
1.3 KiB
Python

#
# Copyright (c) 2024-2025 Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#
import unittest
from pipecat.utils.text.simple_text_aggregator import SimpleTextAggregator
class TestSimpleTextAggregator(unittest.IsolatedAsyncioTestCase):
def setUp(self):
self.aggregator = SimpleTextAggregator()
async def test_reset_aggregations(self):
assert await self.aggregator.aggregate("Hello ") == None
assert self.aggregator.text.text == "Hello"
await self.aggregator.reset()
assert self.aggregator.text.text == ""
async def test_simple_sentence(self):
assert await self.aggregator.aggregate("Hello ") == None
aggregate = await self.aggregator.aggregate("Pipecat!")
assert aggregate.text == "Hello Pipecat!"
assert aggregate.type == "sentence"
assert self.aggregator.text.text == ""
async def test_multiple_sentences(self):
aggregate = await self.aggregator.aggregate("Hello Pipecat! How are ")
assert aggregate.text == "Hello Pipecat!"
# Aggregators should strip leading/trailing spaces when returning text
assert self.aggregator.text.text == "How are"
aggregate = await self.aggregator.aggregate("you?")
assert aggregate.text == "How are you?"