Merge pull request #2365 from pipecat-ai/aleix/update-release-evals-for-new-runner

scripts(evals): update to use new runner function
2025-08-05 13:07:57 -07:00
parent 1138c92a00 5546c8e01c
commit 95c661bdaa
2 changed files with 4 additions and 4 deletions
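
The `eval.py` change below swaps the three-argument `run_example(transport, argparse.Namespace(), True)` call for a single-argument `run_bot(transport)`. As a rough sketch of what this implies for an example script, the snippet below loads a script by path and awaits its `run_bot` entry point. The loader (`load_module`), the helper `call_run_bot`, the inline example source, and the string standing in for a transport are all hypothetical and used only for illustration; how `eval.py` actually imports the script and builds the transport is not shown in this diff.

```python
import asyncio
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical example script exposing the new single-argument entry point.
EXAMPLE_SOURCE = '''
async def run_bot(transport):
    # A real example would build and run a pipeline on the transport.
    return f"ran with {transport}"
'''


def load_module(script_path: Path):
    """Import a Python file by path (assumed loading strategy, not taken from eval.py)."""
    spec = importlib.util.spec_from_file_location(script_path.stem, script_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


async def call_run_bot(script_path: Path, transport):
    """Load the script and await its run_bot(transport) entry point."""
    module = load_module(script_path)
    # Before this PR the call was:
    #   await module.run_example(transport, argparse.Namespace(), True)
    return await module.run_bot(transport)


def demo() -> str:
    """Write the hypothetical script to a temp dir and run it with a stand-in transport."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "example_bot.py"
        script.write_text(EXAMPLE_SOURCE)
        return asyncio.run(call_run_bot(script, "fake-transport"))


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, a script that still defines only `run_example` would fail with an `AttributeError` at the `module.run_bot` lookup, which is the main thing to check when porting an example to the new runner.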
--- a/scripts/evals/README.md
+++ b/scripts/evals/README.md
@@ -32,7 +32,7 @@ also explains why it thinks the answer is valid or invalid.
 To run the release evals:

 ```sh
-python run-release-evals.py -a -v
+uv run run-release-evals.py -a -v
 ```

 This runs all the evals and stores logs and audio (`-a`) for each test.
@@ -41,7 +41,7 @@ You can also specify which tests to run. For example, to run all `07` series
 tests:

 ```sh
-python run-release-evals.py -p 07 -a -v
+uv run run-release-evals.py -p 07 -a -v
 ```

 ## Script Evals
@@ -49,7 +49,7 @@ python run-release-evals.py -p 07 -a -v
 You can also run evals for a single example (not part of the release set):

 ```sh
-python run-eval.py -p "A simple math addition" -a -v YOUR_EXAMPLE_SCRIPT
+uv run run-eval.py -p "A simple math addition" -a -v YOUR_EXAMPLE_SCRIPT
 ```

 Your script needs to follow any of the foundation examples pattern.
--- a/scripts/evals/eval.py
+++ b/scripts/evals/eval.py
@@ -176,7 +176,7 @@ async def run_example_pipeline(script_path: Path):
        ),
    )

-    await module.run_example(transport, argparse.Namespace(), True)
+    await module.run_bot(transport)


 async def run_eval_pipeline(