LLM-eval-survey
LLM-eval-survey is the official GitHub repository for the survey paper "A Survey on Evaluation of Large Language Models". It serves as a comprehensive resource for evaluating large language models (LLMs) across various tasks and domains.
Key Features:
- Comprehensive Evaluation: The repository organizes papers and resources on LLM evaluation, covering a wide range of tasks, including natural language processing and reasoning.
- Contributions Welcome: Users are encouraged to contribute by suggesting new benchmarks or improvements, with proper acknowledgment in the paper.
- Latest Updates: The repository is regularly updated with the latest research and findings in the field of LLM evaluation.
Benefits:
- Research Resource: A valuable resource for researchers and practitioners in the field of AI and natural language processing.
- Community Engagement: Encourages collaboration and community input to enhance the quality and comprehensiveness of the survey.
Highlights:
- Organizes papers according to evaluation criteria such as robustness, ethics, biases, and trustworthiness.
- Includes related projects and benchmarks that support the summarization and evaluation of LLMs.
For more information, visit the GitHub repository.