A collection of benchmarks and datasets for evaluating large language models (LLMs).
The llm_benchmarks repository collects benchmarks and datasets for evaluating the capabilities of LLMs. Its tasks span several domains, including general knowledge, reasoning, summarization, and coding.