diff --git a/docs/README.md b/docs/README.md
index 9db83ac11..e5303dbd8 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -4,6 +4,10 @@
 
 Learn about the thinking behind the SDK's design.
 
+## [A Frame's Progress](frame-progress.md)
+
+See how a Frame is processed through a Transport, a Pipeline, and a series of Frame Processors.
+
 ## [Example Code](examples/)
 
 The repo includes several example apps in the `examples` directory. The docs explain how they work.
diff --git a/docs/frame-progress.md b/docs/frame-progress.md
new file mode 100644
index 000000000..f4348bf88
--- /dev/null
+++ b/docs/frame-progress.md
@@ -0,0 +1,46 @@
+# A Frame's Progress
+
+1. A user says “Hello, LLM” and the cloud transcription service delivers a transcription to the Transport.
+![A transcript frame arrives](images/frame-progress-01.png)
+
+2. The Transport places a Transcription frame in the Pipeline’s source queue.
+![Frame in source queue](images/frame-progress-02.png)
+
+3. The Pipeline passes the Transcription frame to the first Frame Processor in its list, the LLM User Message Aggregator.
+![To UMA](images/frame-progress-03.png)
+
+4. The LLM User Message Aggregator updates the LLM Context with a `{"user": "Hello, LLM"}` message.
+![Update context](images/frame-progress-04.png)
+
+5. The LLM User Message Aggregator yields an LLM Message Frame containing the updated LLM Context. The Pipeline passes this frame to the LLM Frame Processor.
+![Update context](images/frame-progress-05.png)
+
+6. The LLM Frame Processor creates a streaming chat completion based on the LLM Context and yields the first chunk of the response, a Text Frame with the value “Hi, ”. The Pipeline passes this frame to the TTS Frame Processor. The TTS Frame Processor aggregates this response but doesn’t yield anything yet, because it’s waiting for a full sentence.
+![LLM yields Text](images/frame-progress-06.png)
+
+7. The LLM Frame Processor yields another Text Frame with the value “there.” The Pipeline passes this frame to the TTS Frame Processor.
+![LLM yields more Text](images/frame-progress-07.png)
+
+8. The TTS Frame Processor now has a full sentence, so it starts streaming audio based on “Hi, there.” It yields the first chunk of streaming audio as an Audio frame, which the Pipeline passes to the LLM Assistant Message Aggregator.
+![TTS yields Audio](images/frame-progress-08.png)
+
+9. The LLM Assistant Message Aggregator doesn’t do anything with Audio frames, so it immediately yields the frame, unchanged. This is the convention for all Frame Processors: frames that the processor doesn’t process should be immediately yielded.
+![pass-through](images/frame-progress-09.png)
+
+10. The Pipeline places the first Audio frame in its sink queue, which is being watched by the Transport. Since the frame is now in a queue, the Pipeline can continue processing other frames. Note that the source and sink queues form a sort of “boundary of concurrent processing” between a Pipeline and the outside world. In a Pipeline, frames are processed sequentially; once a frame is on a queue, it can be processed in parallel with the frames being processed by the Pipeline. TODO: link to a more in-depth section about this.
+![sink queue](images/frame-progress-10.png)
+
+11. The TTS Frame Processor yields another Audio frame as the Transport transmits the first Audio frame.
+![parallel audio](images/frame-progress-11.png)
+
+12. As before, the LLM Assistant Message Aggregator immediately yields the Audio frame and the Pipeline places the Audio frame in the sink queue.
+![sink queue 2](images/frame-progress-12.png)
+
+13. The TTS Frame Processor has no more frames to yield. The LLM Frame Processor emits an LLM Response End Frame, which the Pipeline passes to the TTS Frame Processor.
+![response end](images/frame-progress-13.png)
+
+14. The TTS Frame Processor immediately yields the LLM Response End Frame, so the Pipeline passes it along to the LLM Assistant Message Aggregator. The LLM Assistant Message Aggregator updates the LLM Context with the full response from the LLM. TODO: the steps above omit that the TTS Frame Processor also yields the Text frames that the LLM emitted, so that the LLM Assistant Message Aggregator can accumulate them.
+![response end](images/frame-progress-14.png)
+
+15. The system is quiet, and waiting for the next message from the Transport.
+![response end](images/frame-progress-15.png)
diff --git a/docs/images/frame-progress-01.png b/docs/images/frame-progress-01.png
new file mode 100644
index 000000000..f4f0a0e3e
Binary files /dev/null and b/docs/images/frame-progress-01.png differ
diff --git a/docs/images/frame-progress-02.png b/docs/images/frame-progress-02.png
new file mode 100644
index 000000000..7eddb0c1d
Binary files /dev/null and b/docs/images/frame-progress-02.png differ
diff --git a/docs/images/frame-progress-03.png b/docs/images/frame-progress-03.png
new file mode 100644
index 000000000..7579be4b0
Binary files /dev/null and b/docs/images/frame-progress-03.png differ
diff --git a/docs/images/frame-progress-04.png b/docs/images/frame-progress-04.png
new file mode 100644
index 000000000..b215c7ccc
Binary files /dev/null and b/docs/images/frame-progress-04.png differ
diff --git a/docs/images/frame-progress-05.png b/docs/images/frame-progress-05.png
new file mode 100644
index 000000000..5fb2ef967
Binary files /dev/null and b/docs/images/frame-progress-05.png differ
diff --git a/docs/images/frame-progress-06.png b/docs/images/frame-progress-06.png
new file mode 100644
index 000000000..d39510c7c
Binary files /dev/null and b/docs/images/frame-progress-06.png differ
diff --git a/docs/images/frame-progress-07.png b/docs/images/frame-progress-07.png
new file mode 100644
index 000000000..cdfc8b0ef
Binary files /dev/null and b/docs/images/frame-progress-07.png differ
diff --git a/docs/images/frame-progress-08.png b/docs/images/frame-progress-08.png
new file mode 100644
index 000000000..382882d52
Binary files /dev/null and b/docs/images/frame-progress-08.png differ
diff --git a/docs/images/frame-progress-09.png b/docs/images/frame-progress-09.png
new file mode 100644
index 000000000..fd83bffa3
Binary files /dev/null and b/docs/images/frame-progress-09.png differ
diff --git a/docs/images/frame-progress-10.png b/docs/images/frame-progress-10.png
new file mode 100644
index 000000000..f4c25aaae
Binary files /dev/null and b/docs/images/frame-progress-10.png differ
diff --git a/docs/images/frame-progress-11.png b/docs/images/frame-progress-11.png
new file mode 100644
index 000000000..f6f8f0fdc
Binary files /dev/null and b/docs/images/frame-progress-11.png differ
diff --git a/docs/images/frame-progress-12.png b/docs/images/frame-progress-12.png
new file mode 100644
index 000000000..84b97555a
Binary files /dev/null and b/docs/images/frame-progress-12.png differ
diff --git a/docs/images/frame-progress-13.png b/docs/images/frame-progress-13.png
new file mode 100644
index 000000000..40835eb6d
Binary files /dev/null and b/docs/images/frame-progress-13.png differ
diff --git a/docs/images/frame-progress-14.png b/docs/images/frame-progress-14.png
new file mode 100644
index 000000000..228f0278d
Binary files /dev/null and b/docs/images/frame-progress-14.png differ
diff --git a/docs/images/frame-progress-15.png b/docs/images/frame-progress-15.png
new file mode 100644
index 000000000..86ee47f8a
Binary files /dev/null and b/docs/images/frame-progress-15.png differ
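
Reviewer note: the flow that `frame-progress.md` describes (a source queue, a list of Frame Processors that yield frames, and a sink queue) can be sketched in plain Python with `asyncio`. This is only an illustration under stated assumptions, not the SDK's API: the names `Frame`, `TextFrame`, `AudioFrame`, `EndFrame`, `FrameProcessor.process`, `SentenceTTS`, `run_stage`, and `run_pipeline` are all hypothetical, and the "audio" is just the sentence encoded as bytes and split in two.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, List


@dataclass
class Frame:
    """Hypothetical base frame type (not the SDK's class)."""


@dataclass
class TextFrame(Frame):
    text: str


@dataclass
class AudioFrame(Frame):
    data: bytes


@dataclass
class EndFrame(Frame):
    """Hypothetical sentinel that stops this sketch's pipeline loop."""


class FrameProcessor:
    """Default processor: follows the pass-through convention from step 9."""

    async def process(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        # Frames the processor does not handle are yielded unchanged.
        yield frame


class SentenceTTS(FrameProcessor):
    """Buffers text until a sentence ends, then yields fake audio chunks."""

    def __init__(self) -> None:
        self._buffer = ""

    async def process(self, frame: Frame) -> AsyncGenerator[Frame, None]:
        if not isinstance(frame, TextFrame):
            yield frame  # pass-through for everything that is not text
            return
        self._buffer += frame.text
        if self._buffer.rstrip().endswith((".", "?", "!")):
            sentence, self._buffer = self._buffer, ""
            mid = len(sentence) // 2
            # Two chunks stand in for streamed audio.
            yield AudioFrame(sentence[:mid].encode())
            yield AudioFrame(sentence[mid:].encode())


async def run_stage(processors, index, frame, sink) -> None:
    """Push one frame through processors[index:], then onto the sink queue."""
    if index == len(processors):
        await sink.put(frame)
        return
    async for out in processors[index].process(frame):
        await run_stage(processors, index + 1, out, sink)


async def run_pipeline(processors, source, sink) -> None:
    """Process frames from the source queue one at a time, in order."""
    while True:
        frame = await source.get()
        await run_stage(processors, 0, frame, sink)
        if isinstance(frame, EndFrame):
            break


async def demo() -> List[Frame]:
    source: asyncio.Queue = asyncio.Queue()
    sink: asyncio.Queue = asyncio.Queue()
    for frame in (TextFrame("Hi, "), TextFrame("there."), EndFrame()):
        source.put_nowait(frame)
    # The second processor is a plain pass-through, like the aggregator in step 9.
    await run_pipeline([SentenceTTS(), FrameProcessor()], source, sink)
    out: List[Frame] = []
    while not sink.empty():
        out.append(sink.get_nowait())
    return out


frames = asyncio.run(demo())
```

In this sketch the first Text frame produces nothing on the sink queue, the second completes the sentence and produces two Audio frames, and the end frame passes through both processors untouched. A real Transport would consume the sink queue concurrently rather than draining it after the pipeline finishes.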