Evalchemy

A unified toolkit for automatic evaluations of large language models (LLMs).

Introduction

Evalchemy is a unified, easy-to-use toolkit for evaluating post-trained large language models (LLMs). Developed by the DataComp community and Bespoke Labs, it builds on LM-Eval-Harness to provide a comprehensive solution for model evaluation.

Key Features:
  • Unified Installation: One-step setup for all benchmarks, eliminating dependency conflicts.
  • Parallel Evaluation: Distribute evaluations across multiple GPUs for faster results.
  • Simplified Usage: Run any benchmark with a consistent command-line interface (see the example invocation after this list).
  • Results Management: Local results tracking with standardized output format and optional database integration for systematic tracking.
  • Custom Evaluations: Implement custom evaluations and add external evaluation repositories easily.
  • Support for Various Models: Compatibility with OpenAI models, vLLM models, and HuggingFace models.
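
For example, a typical single-model run uses the LM-Eval-Harness-style command-line interface that Evalchemy builds on. This is a minimal sketch assuming a standard installation; the task names and model are illustrative:

  # Evaluate a HuggingFace model on selected benchmarks.
  # Evalchemy exposes its CLI through the eval.eval module.
  python -m eval.eval \
      --model hf \
      --tasks MTBench,alpaca_eval \
      --model_args "pretrained=meta-llama/Meta-Llama-3-8B-Instruct" \
      --batch_size 2 \
      --output_path logs

The same --model flag selects other backends, such as vLLM or OpenAI API models.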

Benefits:
  • Efficiency: Dramatically reduce wall-clock time on large benchmarks through parallel processing (a multi-GPU sketch follows this list).
  • Flexibility: Supports a wide range of benchmarks and model types, making it versatile for different evaluation needs.
  • Cost-Effective: Provides insights into runtime and cost analysis, helping users optimize their evaluation processes.
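
As a sketch of the parallel path, a data-parallel run can be launched through HuggingFace accelerate; the process count below is illustrative and should match your hardware:

  # Shard the evaluation across 8 GPUs on a single node.
  accelerate launch --multi_gpu --num_processes 8 \
      -m eval.eval \
      --model hf \
      --tasks MTBench,alpaca_eval \
      --model_args "pretrained=meta-llama/Meta-Llama-3-8B-Instruct" \
      --batch_size 2 \
      --output_path logs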

Highlights:
  • New reasoning benchmarks and model support are regularly added, enhancing the toolkit's capabilities.
  • Detailed logging and debugging features assist in troubleshooting and performance optimization.

Running common benchmarks with Evalchemy is simple, fast, and flexible, making it an essential tool for researchers and developers working with LLMs.
