We now distinguish between input and output audio and image frames. We introduce
`InputAudioRawFrame`, `OutputAudioRawFrame`, `InputImageRawFrame` and
`OutputImageRawFrame` (and other subclasses of those). The input frames usually
come from an input transport and are meant to be processed inside the pipeline
to generate new frames. However, the input frames will not be sent through an
output transport. The output frames can also be processed by any frame processor
in the pipeline and they are allowed to be sent by the output transport.
1. Fleshed out MetricsFrames and broke it into a proper set of types
2. Add model_name as a property to the AIService so that it can be
automatically included in metrics and also remove that
overhead from all the various services themselves
Breaking change!
Because of the types improvements, the MetricsFrame type has
changed. Each frame will have a list of metrics simlilar to before
except each item in the list will only contain one type of metric:
"ttfb", "tokens", "characters", or "processing". Previously these
fields would be in every entry but set to None if they didn't apply.
While this changes internal handling of the MetricsFrame, it does NOT
break the RTVI/daily messaging of metrics. That format remains the same.
Also. Remember to use model_name for accessing a service's current
model and set_model_name for setting it.
Pipecat has a pipeline-based architecture. The pipeline consists of frame
processors linked to each other. The elements travelling across the pipeline are
called frames.
To have a deterministic behavior the frames travelling through the pipeline
should always be ordered, except system frames which are out-of-band frames. To
achieve that, each frame processor should only output frames from a single task.
There are synchronous and asynchronous frame processors. The synchronous
processors push output frames from the same task that they receive input frames,
and therefore only pushing frames from one task. Asynchrnous frame processors
can have internal tasks to perform things asynchrnously (e.g. receiving data
from a websocket) but they also have a single task where they push frames from.