# Understanding Different Frame Types in the Pipecat System In the Pipecat system, frames are used to represent different types of data and control signals that flow through the pipeline. Understanding these frame types is crucial for working with the system effectively. This tutorial will cover the main categories of frames and their specific uses. ## 1. Base Frame Classes ### Frame The `Frame` class is the base class for all frames. It includes: - `id`: A unique identifier - `name`: A descriptive name - `pts`: Presentation timestamp (optional) ### DataFrame `DataFrame` is a subclass of `Frame` and serves as a base for most data-carrying frames. ## 2. Audio Frames ### AudioRawFrame Represents a chunk of audio with properties: - `audio`: Raw audio data - `sample_rate`: Audio sample rate - `num_channels`: Number of audio channels Subclasses include: - `InputAudioRawFrame`: For audio from input sources - `OutputAudioRawFrame`: For audio to be played by output devices - `TTSAudioRawFrame`: For audio generated by Text-to-Speech services ## 3. Image Frames ### ImageRawFrame Represents an image with properties: - `image`: Raw image data - `size`: Image dimensions - `format`: Image format (e.g., JPEG, PNG) Subclasses include: - `InputImageRawFrame`: For images from input sources - `OutputImageRawFrame`: For images to be displayed - `UserImageRawFrame`: For images associated with a specific user - `VisionImageRawFrame`: For images with associated text for description - `URLImageRawFrame`: For images with an associated URL ### SpriteFrame Represents an animated sprite, containing a list of `ImageRawFrame` objects. ## 4. Text and Transcription Frames ### TextFrame Represents a chunk of text, used for various purposes in the pipeline. ### TranscriptionFrame A specialized `TextFrame` for speech transcriptions, including: - `user_id`: ID of the speaking user - `timestamp`: When the transcription was generated - `language`: Detected language of the speech ### InterimTranscriptionFrame Similar to `TranscriptionFrame`, but for interim (not final) transcriptions. ## 5. LLM (Language Model) Frames ### LLMMessagesFrame Contains a list of messages for an LLM service to process. ### LLMMessagesAppendFrame and LLMMessagesUpdateFrame Used to modify the current context of LLM messages. ### LLMSetToolsFrame Specifies tools (functions) available for the LLM to use. ### LLMEnablePromptCachingFrame Controls prompt caching in certain LLMs. ## 6. System and Control Frames ### SystemFrame Base class for system-level frames. Important system frames include: - `StartFrame`: Initiates a pipeline - `CancelFrame`: Stops a pipeline immediately - `ErrorFrame`: Notifies of errors (with `FatalErrorFrame` for unrecoverable errors) - `EndTaskFrame` and `CancelTaskFrame`: Control pipeline tasks - `StartInterruptionFrame` and `StopInterruptionFrame`: Indicate user speech for interruptions ### ControlFrame Base class for control-flow frames. Notable control frames: - `EndFrame`: Signals the end of a pipeline - `LLMFullResponseStartFrame` and `LLMFullResponseEndFrame`: Bracket LLM responses - `UserStartedSpeakingFrame` and `UserStoppedSpeakingFrame`: Indicate user speech activity - `BotStartedSpeakingFrame` and `BotStoppedSpeakingFrame`: Indicate bot speech activity - `TTSStartedFrame` and `TTSStoppedFrame`: Bracket Text-to-Speech responses ## 7. Special Purpose Frames ### AppFrame Base class for application-specific custom frames. ### MetricsFrame Contains performance metrics data. ### FunctionCallInProgressFrame and FunctionCallResultFrame Used for handling LLM function (tool) calls. ### ServiceUpdateSettingsFrame Base class for updating service settings, with specific subclasses for LLM, TTS, and STT services. ## Conclusion Understanding these frame types is essential for working with the Pipecat system. Each frame type serves a specific purpose in the pipeline, whether it's carrying data (like audio or images), controlling the flow of the pipeline, or managing system-level operations. By using the appropriate frame types, you can effectively process and transmit various kinds of information through your pipeline.