pytest-evals
pytest-evals is a minimalistic pytest plugin for running and analyzing evaluation tests for Large Language Models (LLMs). It simplifies testing LLM outputs against predefined examples, so you can track whether your models keep performing as expected over time.
Key Features:
- Easy Integration: Works seamlessly with pytest, Jupyter notebooks, and CI/CD pipelines.
- Parallel Testing: Supports running evaluation cases in parallel via pytest-xdist (see the command sketch after this list).
- Comprehensive Metrics: Collects and analyzes performance metrics to track LLM accuracy.
- User-Friendly: Minimalistic design that focuses on logic rather than complex frameworks.
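For example, a parallel run combines pytest-xdist's worker option with the plugin's evaluation flag. This is a minimal sketch assuming the `--run-eval` flag described in the project's documentation:

```bash
# Install the plugin together with pytest-xdist for parallel workers
pip install pytest-evals pytest-xdist

# Run the evaluation phase across all available CPU cores
# (--run-eval comes from pytest-evals, -n auto from pytest-xdist)
pytest --run-eval -n auto
```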
Benefits:
- Automated Testing: Eliminates the need for manual checking of LLM outputs, saving time and reducing errors.
- Flexible Data Management: Test data can be managed in CSV files, keeping it accessible to non-technical stakeholders (see the sketch after this list).
- Community Contributions: Encourages open-source contributions, fostering a collaborative environment for improvement and innovation.
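As an illustration of CSV-driven test data, the sketch below parametrizes an evaluation test from a CSV file loaded with pandas. The file name, column names, and `classify` function are hypothetical; the `eval_bag` fixture and `@pytest.mark.eval` marker follow the plugin's documented usage:

```python
import pandas as pd
import pytest


def classify(text: str) -> str:
    """Placeholder for the LLM call under evaluation."""
    return "positive" if "great" in text.lower() else "negative"


# Hypothetical CSV with "input" and "expected" columns,
# editable by non-technical stakeholders in a spreadsheet
test_cases = pd.read_csv("eval_cases.csv").to_dict(orient="records")


@pytest.mark.eval(name="sentiment")
@pytest.mark.parametrize("case", test_cases)
def test_sentiment(case, eval_bag):
    # eval_bag records per-case results for the later analysis phase
    eval_bag.input = case["input"]
    eval_bag.expected = case["expected"]
    eval_bag.prediction = classify(case["input"])
```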
Highlights:
- Install with a single command: `pip install pytest-evals`
- Run evaluation tests and analyze results with straightforward commands.
- Designed to keep evaluations clean and focused by separating the evaluation and analysis phases, as sketched below.
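To illustrate the two-phase separation, here is a minimal sketch: the first test records per-case results during the evaluation phase, and the second aggregates them during the analysis phase. The markers, the `eval_bag`/`eval_results` fixtures, and the run flags follow the plugin's documented pattern; the example data, `classify` function, and threshold are illustrative.

```python
import pytest


def classify(text: str) -> str:
    """Placeholder for the LLM call under evaluation."""
    return "positive" if "great" in text.lower() else "negative"


@pytest.mark.eval(name="sentiment")
@pytest.mark.parametrize("case", [
    {"input": "This is great!", "expected": "positive"},
    {"input": "This is awful.", "expected": "negative"},
])
def test_sentiment(case, eval_bag):
    # Evaluation phase: record each case's outcome; no per-case pass/fail assertion
    eval_bag.expected = case["expected"]
    eval_bag.prediction = classify(case["input"])


@pytest.mark.eval_analysis(name="sentiment")
def test_sentiment_analysis(eval_results):
    # Analysis phase: aggregate all recorded cases and assert an overall threshold
    accuracy = sum(
        1 for r in eval_results if r.prediction == r.expected
    ) / len(eval_results)
    assert accuracy >= 0.8
```

The two phases are then run as separate passes, e.g. `pytest --run-eval` followed by `pytest --run-eval-analysis`, which keeps per-case noise out of the aggregate pass/fail decision.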