Search
Collection
Category
Tag
Blog
Pricing
Submit

Newsletter

Join the Community

Subscribe to our newsletter for the latest news and updates

Email

AISecKit

Curated AI security tools & LLM safety resources for cybersecurity professionals

Product

Search
Collection
Category
Tag

Resources

Blog
Pricing
Submit

Tools

🔥Marathons Tools

Company

About Us
Privacy Policy
Terms of Service
Sitemap

Copyright © 2025 All Rights Reserved.

Home
Category
SafetyBench

SafetyBench

Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).

image for SafetyBench

Introduction

Information

Publisher
AISecKit
Websitegithub.com
Published date2025/04/28

Categories

AI Ethics Resources
AI Research Papers

Tags

Safety Alignments
Open Source

More Products

image of agentic-design-patterns-cn

AI Application PlatformsAI Research PapersAI Development Frameworks

agentic-design-patterns-cn

A bilingual Chinese-English translation of 'Agentic Design Patterns' by Antonio Gulli, focusing on intelligent systems design.

AI Reasoning Open Source AI Education AI Standards AI Communities+1

image of TradingAgents-CN

AI Application PlatformsAI Research PapersAI Development Frameworks

TradingAgents-CN

基于多智能体LLM的中文金融交易框架，支持A股/港股/美股分析。

Market Analysis Open Source LLM AI Communities Generative AI+1

LangFair

LangFair is a Python library for conducting use-case level LLM bias and fairness assessments.

Responsible AI LLM Bias Mitigation

SafetyBench

SafetyBench is a comprehensive benchmark designed specifically for evaluating the safety of large language models (LLMs). This GitHub repository contains resources, data sets, and guidelines for conducting safety evaluations. The benchmark includes over 11,000 diverse multiple-choice questions categorized into various safety concerns, providing researchers with a robust framework to assess LLM performance in terms of safety.

Key Features:

Extensive Dataset: Contains 11,435 diverse questions in both English and Chinese, covering six categories of safety.
Evaluation Framework: Offers detailed instructions on how to evaluate and submit results using the benchmark.
Submissions and Leaderboards: Users can submit their findings for LLM safety assessments and view current leaderboards for performance comparisons.
Integration: SafetyBench is integrated into SuperBench, facilitating comparative evaluations among different LLMs.

Benefits:

Promotes understanding of LLM safety capabilities.
Encourages responsible deployment by identifying safety flaws in AI models.
Provides an open-source platform for continuous improvement in LLM safety evaluations.

Highlights:

Accepted at ACL 2024 conference, reinforcing its academic credibility.
Accessible data and code for easy adoption in research and development.