Merge pull request #845 from pipecat-ai/mb/more-docs-updates

Docs auto-gen improvements
Mark Backman
2024-12-12 15:42:34 -05:00
committed by GitHub
16 changed files with 480 additions and 210 deletions

@@ -1,47 +0,0 @@
name: Generate API Documentation
on:
  release:
    types: [published] # Run on new release
  workflow_dispatch: # Manual trigger
jobs:
  update-docs:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r docs/api/requirements.txt
          pip install .
      - name: Generate API documentation
        run: |
          cd docs/api
          python generate_docs.py
      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v5
        with:
          commit-message: 'docs: Update API documentation'
          title: 'docs: Update API documentation'
          body: |
            Automated PR to update API documentation.
            - Generated using `docs/api/generate_docs.py`
            - Triggered by: ${{ github.event_name }}
          branch: update-api-docs
          delete-branch: true
          labels: |
            documentation

@@ -4,12 +4,65 @@ build:
  os: ubuntu-22.04
  tools:
    python: '3.12'
  jobs:
    pre_build:
      # Commands to run before the build
      - python -m pip install --upgrade pip
      - pip install wheel setuptools
    post_build:
      # Commands to run after the build
      - echo "Build completed"
sphinx:
  configuration: docs/api/conf.py
  fail_on_warning: false # Set to true if you want builds to fail on warnings
python:
  install:
    - requirements: docs/api/requirements.txt
    - requirements: docs/api/requirements-base.txt
    # Try to install Riva first, fall back to PlayHT if it fails
    - requirements: docs/api/requirements-riva.txt || true
    - requirements: docs/api/requirements-playht.txt || true
    - method: pip
      path: .
      extra_requirements:
        - anthropic
        - assemblyai
        - aws
        - azure
        - canonical
        - cartesia
        - daily
        - deepgram
        - elevenlabs
        - fal
        - fireworks
        - gladia
        - google
        - grok
        - groq
        - krisp
        - langchain
        - livekit
        - lmnt
        - local
        - moondream
        - nim
        - noisereduce
        - openai
        - openpipe
        - silero
        - simli
        - soundfile
        - websocket
        - whisper
search:
  ranking:
    api/*: 5
    getting-started/*: 4
    guides/*: 3
submodules:
  include: all
  recursive: true

docs/api/README.md Normal file

@@ -0,0 +1,109 @@
# Pipecat Documentation
This directory contains the source files for auto-generating Pipecat's server API reference documentation.
## Setup
1. Install documentation dependencies:
```bash
pip install -r requirements-base.txt
pip install -r requirements-riva.txt
pip install -r requirements-playht.txt
```
2. Make the build scripts executable:
```bash
chmod +x build-docs.sh rtd-test.sh
```
## Building Documentation
From this directory, you can build the documentation in several ways:
### Local Build
```bash
# Using the build script (automatically opens docs on macOS when done)
./build-docs.sh
# Or directly with sphinx-build
sphinx-build -b html . _build/html -W --keep-going
```
### ReadTheDocs Test Build
To test the documentation build process exactly as it would run on ReadTheDocs:
```bash
./rtd-test.sh
```
This script:
- Creates a fresh virtual environment
- Installs all dependencies as specified in requirements files
- Handles conflicting dependencies (like grpcio versions for Riva and PlayHT)
- Builds the documentation in an isolated environment
- Provides detailed logging of the build process
Use this script to verify your documentation will build correctly on ReadTheDocs before pushing changes.
## Viewing Documentation
The built documentation will be available at `_build/html/index.html`. To open:
```bash
# On macOS
open _build/html/index.html
# On Linux
xdg-open _build/html/index.html
# On Windows
start _build/html/index.html
```
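The per-platform commands above can also be collapsed into one cross-platform step. The following is a minimal sketch, not part of the repository: it assumes the default `_build/html/index.html` output path, and the `docs_index_uri` and `open_docs` helper names are hypothetical. It relies only on Python's standard `webbrowser` and `pathlib` modules.

```python
import webbrowser
from pathlib import Path


def docs_index_uri(build_dir: str = "_build/html") -> str:
    """Return a file:// URI for the built index page (hypothetical helper)."""
    return (Path(build_dir) / "index.html").resolve().as_uri()


def open_docs(build_dir: str = "_build/html") -> bool:
    """Open the built docs in the default browser on macOS, Linux, or Windows."""
    return webbrowser.open(docs_index_uri(build_dir))
```

Calling `open_docs()` from this directory after a build would open the same page as the commands above.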
## Directory Structure
```
.
├── api/ # Auto-generated API documentation
├── _build/ # Built documentation
├── _static/ # Static files (images, css, etc.)
├── conf.py # Sphinx configuration
├── index.rst # Main documentation entry point
├── requirements-base.txt # Base documentation dependencies
├── requirements-riva.txt # Riva-specific dependencies
├── requirements-playht.txt # PlayHT-specific dependencies
├── build-docs.sh # Local build script
└── rtd-test.sh # ReadTheDocs test build script
```
## Notes
- Documentation is auto-generated from Python docstrings
- Service modules are automatically detected and included
- The build process matches our ReadTheDocs configuration
- Warnings are treated as errors (-W flag) to maintain consistency
- The --keep-going flag ensures all errors are reported
- Dependencies are split into multiple requirements files to handle version conflicts
## Troubleshooting
If you encounter missing service modules:
1. Verify the service is installed with its extras: `pip install pipecat-ai[service-name]`
2. Check the build logs for import errors
3. Ensure the service module is properly initialized in the package
4. Run `./rtd-test.sh` to test in an isolated environment matching ReadTheDocs
For dependency conflicts:
1. Check the requirements files for version specifications
2. Use `rtd-test.sh` to verify dependency resolution
3. Consider adding service-specific requirements files if needed
For more information:
- [ReadTheDocs Configuration](.readthedocs.yaml)
- [Sphinx Documentation](https://www.sphinx-doc.org/)
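For the missing-module checks described under Troubleshooting, a quick import probe can narrow down which modules fail before running a full build. This is a minimal sketch under assumptions: `find_missing` is a hypothetical helper, not something the repository provides, and any module names passed to it are up to the caller.

```python
import importlib


def find_missing(module_names):
    """Return (name, error) pairs for modules that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # optional dependencies can fail in several ways
            missing.append((name, f"{type(exc).__name__}: {exc}"))
    return missing
```

For example, `find_missing(["pipecat.services.deepgram", "pipecat.services.riva"])` would list whichever of those two modules fails to import in the current environment, along with the error text.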

docs/api/build-docs.sh Executable file

@@ -0,0 +1,10 @@
#!/bin/bash
# Clean previous build
rm -rf _build
# Build docs matching ReadTheDocs configuration
sphinx-build -b html -d _build/doctrees . _build/html -W --keep-going
# Open docs (macOS only; use xdg-open on Linux)
open _build/html/index.html
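
The script above depends on `open`, which exists only on macOS. A portable variant could be sketched in Python as follows. This is an illustration under assumptions, not code from the commit: the `sphinx_command` and `build_docs` names are hypothetical, and `python -m sphinx` is used as the module form of the `sphinx-build` command.

```python
import shutil
import subprocess
import sys
import webbrowser
from pathlib import Path


def sphinx_command(source: Path, build: Path) -> list[str]:
    """Assemble the same sphinx-build invocation that build-docs.sh uses."""
    return [
        sys.executable, "-m", "sphinx",  # module form of the sphinx-build CLI
        "-b", "html",
        "-d", str(build / "doctrees"),
        str(source), str(build / "html"),
        "-W", "--keep-going",
    ]


def build_docs(source: Path = Path("."), build: Path = Path("_build")) -> None:
    """Clean, build, and open the docs; raises if sphinx exits non-zero."""
    shutil.rmtree(build, ignore_errors=True)  # clean previous build
    subprocess.run(sphinx_command(source, build), check=True)
    webbrowser.open((build / "html" / "index.html").resolve().as_uri())
```

Keeping the command assembly in its own function makes it possible to inspect the arguments without running a build.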

@@ -1,6 +1,11 @@
import logging
import sys
from pathlib import Path
# Configure logging
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
logger = logging.getLogger("sphinx-build")
# Add source directory to path
docs_dir = Path(__file__).parent
project_root = docs_dir.parent.parent
@@ -32,13 +37,144 @@ autodoc_default_options = {
"undoc-members": True,
"exclude-members": "__weakref__",
"no-index": True,
"show-inheritance": True,
}
# Mock imports for optional dependencies
autodoc_mock_imports = [
"riva",
"livekit",
"pyht",
"anthropic",
"assemblyai",
"boto3",
"azure",
"cartesia",
"deepgram",
"elevenlabs",
"fal",
"gladia",
"google",
"krisp",
"langchain",
"lmnt",
"noisereduce",
"openai",
"openpipe",
"simli",
"soundfile",
# Add these new mocks
"pyaudio",
"_tkinter",
"tkinter",
"daily",
"daily_python",
"pydantic.BaseModel", # Mock base pydantic to avoid model conflicts
"pydantic.Field",
"pydantic._internal._model_construction",
"pydantic._internal._fields",
]
# HTML output settings
html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
autodoc_typehints = "description"
html_show_sphinx = False # Remove "Built with Sphinx"
html_show_sphinx = False
def verify_modules():
    """Verify that required modules are available."""
    required_modules = {
        "services": [
            "assemblyai",
            "aws",
            "cartesia",
            "deepgram",
            "google",
            "lmnt",
            "riva",
            "simli",
        ],
        "serializers": ["livekit"],
        "vad": ["silero", "vad_analyzer"],
        "transports": {
            "services": ["daily", "livekit"],
            "local": ["audio", "tk"],
            "network": ["fastapi_websocket", "websocket_server"],
        },
    }
    missing = []
    for category, modules in required_modules.items():
        if isinstance(modules, dict):
            # Handle nested structure
            for subcategory, submodules in modules.items():
                for module in submodules:
                    try:
                        __import__(f"pipecat.{category}.{subcategory}.{module}")
                        logger.info(
                            f"Successfully imported pipecat.{category}.{subcategory}.{module}"
                        )
                    except (ImportError, TypeError, NameError) as e:
                        missing.append(f"pipecat.{category}.{subcategory}.{module}")
                        logger.warning(
                            f"Optional module not available: pipecat.{category}.{subcategory}.{module} - {str(e)}"
                        )
        else:
            # Handle flat structure
            for module in modules:
                try:
                    __import__(f"pipecat.{category}.{module}")
                    logger.info(f"Successfully imported pipecat.{category}.{module}")
                except (ImportError, TypeError, NameError) as e:
                    missing.append(f"pipecat.{category}.{module}")
                    logger.warning(
                        f"Optional module not available: pipecat.{category}.{module} - {str(e)}"
                    )
    if missing:
        logger.warning(f"Some optional modules are not available: {missing}")


def clean_title(title: str) -> str:
    """Automatically clean module titles."""
    # Remove everything after space (like 'module', 'processor', etc.)
    title = title.split(" ")[0]
    # Get the last part of the dot-separated path
    parts = title.split(".")
    title = parts[-1]
    # Special cases for service names and common acronyms
    special_cases = {
        "ai": "AI",
        "aws": "AWS",
        "api": "API",
        "vad": "VAD",
        "assemblyai": "AssemblyAI",
        "deepgram": "Deepgram",
        "elevenlabs": "ElevenLabs",
        "openai": "OpenAI",
        "openpipe": "OpenPipe",
        "playht": "PlayHT",
        "xtts": "XTTS",
        "lmnt": "LMNT",
    }
    # Check if the entire title is a special case
    if title.lower() in special_cases:
        return special_cases[title.lower()]
    # Otherwise, capitalize each word
    words = title.split("_")
    cleaned_words = []
    for word in words:
        if word.lower() in special_cases:
            cleaned_words.append(special_cases[word.lower()])
        else:
            cleaned_words.append(word.capitalize())
    return " ".join(cleaned_words)


def setup(app):
@@ -55,24 +191,56 @@ def setup(app):
import shutil
shutil.rmtree(output_dir)
logger.info(f"Cleaned existing documentation in {output_dir}")
print(f"Generating API documentation...")
print(f"Output directory: {output_dir}")
print(f"Source directory: {source_dir}")
logger.info(f"Generating API documentation...")
logger.info(f"Output directory: {output_dir}")
logger.info(f"Source directory: {source_dir}")
# Similar exclusions as in your generate_docs.py
excludes = [
str(project_root / "src/pipecat/pipeline/to_be_updated"),
str(project_root / "src/pipecat/processors/gstreamer"),
str(project_root / "src/pipecat/transports/network"),
str(project_root / "src/pipecat/transports/services"),
str(project_root / "src/pipecat/transports/local"),
str(project_root / "src/pipecat/services/to_be_updated"),
str(project_root / "src/pipecat/vad"), # deprecated
"**/test_*.py",
"**/tests/*.py",
]
try:
main(["-f", "-e", "-M", "--no-toc", "-o", output_dir, source_dir] + excludes)
print("API documentation generated successfully!")
main(
[
"-f", # Force overwriting
"-e", # Put each module on its own page (short form of --separate)
"-M", # Put module documentation before submodule documentation (short form of --module-first)
"--no-toc", # Don't create a table of contents file
"--separate", # Put documentation for each module in its own page
"--module-first", # Module documentation before submodule documentation
"--implicit-namespaces", # Added: Handle implicit namespace packages
"-o",
output_dir,
source_dir,
]
+ excludes
)
logger.info("API documentation generated successfully!")
# Process generated RST files to update titles
for rst_file in Path(output_dir).glob("**/*.rst"): # Changed to recursive glob
content = rst_file.read_text()
lines = content.split("\n")
# Find and clean up the title
if len(lines) > 1 and "=" in lines[1]: # Title is the first line; its "=" underline is the second
old_title = lines[0]
new_title = clean_title(old_title)
content = content.replace(old_title, new_title)
rst_file.write_text(content)
logger.info(f"Updated title: {old_title} -> {new_title}")
except Exception as e:
print(f"Error generating API documentation: {e}")
logger.error(f"Error generating API documentation: {e}", exc_info=True)
# Run module verification
verify_modules()
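
The title post-processing in this hunk swaps the old title for the cleaned one with `str.replace`, which replaces every occurrence of the old title string and leaves the `=` underline at its original length. The sketch below shows one alternative that touches only the first two lines and resizes the underline. It is an illustration, not the commit's code: `retitle` is a hypothetical helper, and `clean_title` is condensed here with a shortened special-case table.

```python
SPECIAL_CASES = {"ai": "AI", "aws": "AWS", "api": "API", "vad": "VAD", "openai": "OpenAI"}


def clean_title(title: str) -> str:
    """Condensed stand-in for the clean_title helper in conf.py."""
    name = title.split(" ")[0].split(".")[-1]
    if name.lower() in SPECIAL_CASES:
        return SPECIAL_CASES[name.lower()]
    return " ".join(SPECIAL_CASES.get(w.lower(), w.capitalize()) for w in name.split("_"))


def retitle(rst: str) -> str:
    """Rewrite the first-line title and resize its '=' underline to match."""
    lines = rst.split("\n")
    # Only act when the second line is a pure '=' underline.
    if len(lines) > 1 and lines[1] and set(lines[1]) == {"="}:
        lines[0] = clean_title(lines[0])
        lines[1] = "=" * len(lines[0])
    return "\n".join(lines)
```

Files with fewer than two lines, or without an `=` underline on the second line, pass through unchanged.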

@@ -1,104 +0,0 @@
#!/usr/bin/env python3
import shutil
import subprocess
from pathlib import Path
def run_command(command: list[str]) -> None:
"""Run a command and print a warning if it fails."""
print(f"Running: {' '.join(command)}")
try:
subprocess.run(command, check=True)
except subprocess.CalledProcessError as e:
print(f"Warning: Command failed: {' '.join(command)}")
print(f"Error: {e}")
def main():
docs_dir = Path(__file__).parent
project_root = docs_dir.parent.parent
# Install documentation requirements
requirements_file = docs_dir / "requirements.txt"
run_command(["pip", "install", "-r", str(requirements_file)])
# Install from project root, not docs directory
run_command(["pip", "install", "-e", str(project_root)])
# Install all service dependencies
services = [
"anthropic",
"assemblyai",
"aws",
"azure",
"canonical",
"cartesia",
# "daily",
"deepgram",
"elevenlabs",
"fal",
"fireworks",
"gladia",
"google",
"grok",
"groq",
"langchain",
# "livekit",
"lmnt",
"moondream",
"nim",
"noisereduce",
"openai",
"openpipe",
"playht",
"silero",
"soundfile",
"websocket",
"whisper",
]
extras = ",".join(services)
try:
run_command(["pip", "install", "-e", f"{str(project_root)}[{extras}]"])
except Exception as e:
print(f"Warning: Some dependencies failed to install: {e}")
# Clean old files
api_dir = docs_dir / "api"
build_dir = docs_dir / "_build"
for dir in [api_dir, build_dir]:
if dir.exists():
shutil.rmtree(dir)
# Generate API documentation
run_command(
[
"sphinx-apidoc",
"-f", # Force overwrite
"-e", # Put each module on its own page
"-M", # Put module documentation before submodule
"--no-toc", # Don't generate modules.rst (cleaner structure)
"-o",
str(api_dir), # Output directory
str(project_root / "src/pipecat"),
# Exclude problematic files and directories
str(project_root / "src/pipecat/processors/gstreamer"), # Optional gstreamer
str(project_root / "src/pipecat/transports/network"), # Pydantic issues
str(project_root / "src/pipecat/transports/services"), # Pydantic issues
str(project_root / "src/pipecat/transports/local"), # Optional dependencies
str(project_root / "src/pipecat/services/to_be_updated"), # Exclude to_be_updated
"**/test_*.py", # Test files
"**/tests/*.py", # Test files
]
)
# Build HTML documentation
run_command(["sphinx-build", "-b", "html", str(docs_dir), str(build_dir / "html")])
print("\nDocumentation generated successfully!")
print(f"HTML docs: {build_dir}/html/index.html")
if __name__ == "__main__":
main()

@@ -13,61 +13,61 @@ Quick Links
* `GitHub Repository <https://github.com/pipecat-ai/pipecat>`_
* `Website <https://pipecat.ai>`_
API Reference
-------------
Core Components
~~~~~~~~~~~~~~~
* :mod:`pipecat.frames`
* :mod:`pipecat.processors`
* :mod:`pipecat.pipeline`
* :mod:`Frames <pipecat.frames>`
* :mod:`Processors <pipecat.processors>`
* :mod:`Pipeline <pipecat.pipeline>`
Audio Processing
~~~~~~~~~~~~~~~~
* :mod:`pipecat.audio`
* :mod:`pipecat.vad`
* :mod:`Audio <pipecat.audio>`
Services
~~~~~~~~
* :mod:`pipecat.services`
* :mod:`Services <pipecat.services>`
Transport & Serialization
~~~~~~~~~~~~~~~~~~~~~~~~~
* :mod:`pipecat.transports`
* :mod:`pipecat.serializers`
* :mod:`Transports <pipecat.transports>`
* :mod:`Local <pipecat.transports.local>`
* :mod:`Network <pipecat.transports.network>`
* :mod:`Services <pipecat.transports.services>`
* :mod:`Serializers <pipecat.serializers>`
Utilities
~~~~~~~~~
* :mod:`pipecat.clocks`
* :mod:`pipecat.metrics`
* :mod:`pipecat.sync`
* :mod:`pipecat.transcriptions`
* :mod:`pipecat.utils`
* :mod:`Clocks <pipecat.clocks>`
* :mod:`Metrics <pipecat.metrics>`
* :mod:`Sync <pipecat.sync>`
* :mod:`Transcriptions <pipecat.transcriptions>`
* :mod:`Utils <pipecat.utils>`
.. toctree::
:maxdepth: 2
:maxdepth: 3
:caption: API Reference
:hidden:
api/pipecat.audio
api/pipecat.clocks
api/pipecat.frames
api/pipecat.metrics
api/pipecat.pipeline
api/pipecat.processors
api/pipecat.serializers
api/pipecat.services
api/pipecat.sync
api/pipecat.transcriptions
api/pipecat.transports
api/pipecat.utils
api/pipecat.vad
Audio <api/pipecat.audio>
Clocks <api/pipecat.clocks>
Frames <api/pipecat.frames>
Metrics <api/pipecat.metrics>
Pipeline <api/pipecat.pipeline>
Processors <api/pipecat.processors>
Serializers <api/pipecat.serializers>
Services <api/pipecat.services>
Sync <api/pipecat.sync>
Transcriptions <api/pipecat.transcriptions>
Transports <api/pipecat.transports>
Utils <api/pipecat.utils>
Indices and tables
==================

@@ -0,0 +1,38 @@
# Sphinx dependencies
sphinx>=8.1.3
sphinx-rtd-theme
sphinx-markdown-builder
sphinx-autodoc-typehints
toml
# Install all extras individually to ensure they're properly resolved
pipecat-ai[anthropic]
pipecat-ai[assemblyai]
pipecat-ai[aws]
pipecat-ai[azure]
pipecat-ai[canonical]
pipecat-ai[cartesia]
pipecat-ai[daily]
pipecat-ai[deepgram]
pipecat-ai[elevenlabs]
pipecat-ai[fal]
pipecat-ai[fireworks]
pipecat-ai[gladia]
pipecat-ai[google]
pipecat-ai[grok]
pipecat-ai[groq]
pipecat-ai[krisp]
pipecat-ai[langchain]
pipecat-ai[livekit]
pipecat-ai[lmnt]
pipecat-ai[local]
pipecat-ai[moondream]
pipecat-ai[nim]
pipecat-ai[noisereduce]
pipecat-ai[openai]
pipecat-ai[openpipe]
pipecat-ai[silero]
pipecat-ai[simli]
pipecat-ai[soundfile]
pipecat-ai[websocket]
pipecat-ai[whisper]

@@ -0,0 +1,3 @@
# Force specific grpcio version for PlayHT
grpcio>=1.68.0
pipecat-ai[playht]

@@ -0,0 +1,3 @@
# Force specific grpcio version for Riva
grpcio==1.65.4
pipecat-ai[riva]
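
The two pins above cannot both hold for a single `grpcio` install, which is why the requirements are split into separate files. The sketch below checks this with a deliberately simplified comparison. It is an illustration only: `parse` and `satisfies` are hypothetical helpers that handle plain numeric versions and the `==` and `>=` operators, not the full version-specifier rules that pip applies.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.65.4' into (1, 65, 4); handles plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, specifier: str) -> bool:
    """Check a version against a single '==' or '>=' specifier."""
    if specifier.startswith("=="):
        return parse(version) == parse(specifier[2:])
    if specifier.startswith(">="):
        return parse(version) >= parse(specifier[2:])
    raise ValueError(f"unsupported specifier: {specifier}")


RIVA_PIN = "==1.65.4"    # from the Riva requirements file
PLAYHT_PIN = ">=1.68.0"  # from the PlayHT requirements file

# The only version the Riva pin accepts is rejected by the PlayHT constraint.
both_ok = satisfies("1.65.4", RIVA_PIN) and satisfies("1.65.4", PLAYHT_PIN)
print(both_ok)  # False
```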

@@ -1,6 +0,0 @@
sphinx>=8.1.3
sphinx-rtd-theme
sphinx-markdown-builder
sphinx-autodoc-typehints
toml
pipecat-ai[anthropic,assemblyai,aws,azure,canonical,cartesia,deepgram,elevenlabs,fal,fireworks,gladia,google,grok,groq,krisp,langchain,lmnt,moondream,nim,noisereduce,openai,openpipe,playht,silero,soundfile,websocket,whisper]

docs/api/rtd-test.sh Executable file

@@ -0,0 +1,36 @@
#!/bin/bash
set -e
# Configuration
DOCS_DIR=$(pwd)
PROJECT_ROOT=$(cd ../../ && pwd)
TEST_DIR="/tmp/rtd-test-$(date +%Y%m%d_%H%M%S)"
echo "Creating test directory: $TEST_DIR"
mkdir -p "$TEST_DIR"
cd "$TEST_DIR"
# Create single virtual environment
python -m venv venv
source venv/bin/activate
echo "Installing base dependencies..."
pip install --upgrade pip wheel setuptools
pip install -r "$DOCS_DIR/requirements-base.txt"
# Try to install optional dependencies, but don't fail if they don't work
echo "Installing Riva dependencies..."
pip install -r "$DOCS_DIR/requirements-riva.txt" || echo "Failed to install Riva dependencies"
echo "Installing PlayHT dependencies..."
pip install -r "$DOCS_DIR/requirements-playht.txt" || echo "Failed to install PlayHT dependencies"
echo "Building documentation..."
cd "$DOCS_DIR"
sphinx-build -b html . "_build/html"
echo "Build complete. Check _build/html directory for output."
# Print installed packages for verification
echo "Installed packages:"
pip freeze
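
The `command || echo "Failed ..."` lines in this script let the optional Riva and PlayHT installs fail without tripping `set -e`. The same pattern can be sketched in Python as follows. This is an illustration under assumptions, not part of the commit: `run_optional` and `pip_install_requirements` are hypothetical helper names.

```python
import subprocess
import sys


def run_optional(command: list[str], label: str) -> bool:
    """Run a command; report failure instead of raising, like `cmd || echo ...`."""
    result = subprocess.run(command, check=False)
    if result.returncode != 0:
        print(f"Failed to install {label} dependencies")
        return False
    return True


def pip_install_requirements(path: str) -> list[str]:
    """Command line for installing a requirements file with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "-r", path]
```

A call such as `run_optional(pip_install_requirements("requirements-riva.txt"), "Riva")` would then mirror the Riva step of the script, returning `False` rather than aborting when the install fails.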

@@ -66,11 +66,11 @@ openpipe = [ "openpipe~=4.38.0" ]
playht = [ "pyht~=0.1.8", "websockets~=13.1" ]
riva = [ "nvidia-riva-client~=2.17.0" ]
silero = [ "onnxruntime~=1.20.1" ]
simli = [ "simli-ai~=0.1.7"]
soundfile = [ "soundfile~=0.12.1" ]
together = [ "openai~=1.57.2" ]
websocket = [ "websockets~=13.1", "fastapi~=0.115.0" ]
whisper = [ "faster-whisper~=1.1.0" ]
simli = [ "simli-ai~=0.1.7"]
[tool.setuptools.packages.find]
# All the following settings are optional:

@@ -11,7 +11,7 @@ import json
import re
from asyncio import CancelledError
from dataclasses import dataclass
from typing import Any, Dict, List, Optional
from typing import Any, Dict, List, Optional, Union
from loguru import logger
from PIL import Image
@@ -75,8 +75,7 @@ class AnthropicContextAggregatorPair:
class AnthropicLLMService(LLMService):
"""
This class implements inference with Anthropic's AI models.
"""This class implements inference with Anthropic's AI models.
Can provide a custom client via the `client` kwarg, allowing you to
use `AsyncAnthropicBedrock` and `AsyncAnthropicVertex` clients
@@ -328,7 +327,7 @@ class AnthropicLLMContext(OpenAILLMContext):
tools: list[dict] | None = None,
tool_choice: dict | None = None,
*,
system: str | NotGiven = NOT_GIVEN,
system: Union[str, NotGiven] = NOT_GIVEN,
):
super().__init__(messages=messages, tools=tools, tool_choice=tool_choice)

@@ -1,3 +1,9 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD-2-Clause
#
import asyncio
from typing import AsyncGenerator
@@ -67,8 +73,7 @@ class AssemblyAISTTService(STTService):
await self._disconnect()
async def run_stt(self, audio: bytes) -> AsyncGenerator[Frame, None]:
"""
Process an audio chunk for STT transcription.
"""Process an audio chunk for STT transcription.
This method streams the audio data to AssemblyAI for real-time transcription.
Transcription results are handled asynchronously via callback functions.
@@ -83,8 +88,7 @@ class AssemblyAISTTService(STTService):
yield None
async def _connect(self):
"""
Establish a connection to the AssemblyAI real-time transcription service.
"""Establish a connection to the AssemblyAI real-time transcription service.
This method sets up the necessary callback functions and initializes the
AssemblyAI transcriber.
@@ -95,8 +99,7 @@ class AssemblyAISTTService(STTService):
logger.info(f"{self}: Connected to AssemblyAI")
def on_data(transcript: aai.RealtimeTranscript):
"""
Callback for handling incoming transcription data.
"""Callback for handling incoming transcription data.
This function runs in a separate thread from the main asyncio event loop.
It creates appropriate transcription frames and schedules them to be
@@ -121,8 +124,7 @@ class AssemblyAISTTService(STTService):
asyncio.run_coroutine_threadsafe(self.push_frame(frame), self._loop)
def on_error(error: aai.RealtimeError):
"""
Callback for handling errors from AssemblyAI.
"""Callback for handling errors from AssemblyAI.
Like on_data, this runs in a separate thread and schedules error
handling in the main event loop.

@@ -1,3 +1,9 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD-2-Clause
#
from typing import AsyncGenerator, Optional
import aiohttp