SafetyBench

Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).

Introduction

SafetyBench is a benchmark for evaluating the safety of large language models (LLMs). Its GitHub repository contains the datasets, code, and guidelines needed to run safety evaluations. The benchmark comprises 11,435 multiple-choice questions grouped by category of safety concern, giving researchers a common basis for assessing and comparing LLM safety.

Key Features:
  • Extensive Dataset: Contains 11,435 multiple-choice questions in both English and Chinese, covering seven categories of safety concerns.
  • Evaluation Framework: Offers detailed instructions on how to evaluate and submit results using the benchmark.
  • Submissions and Leaderboards: Users can submit their evaluation results and compare model performance on the current leaderboards.
  • Integration: SafetyBench is integrated into SuperBench, facilitating comparative evaluations among different LLMs.
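Because every question is multiple choice, scoring reduces to per-category accuracy. The following is a minimal sketch of that idea, not the repository's own evaluation code: the record fields (`id`, `category`, `answer`), the category labels, and the function name `per_category_accuracy` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def per_category_accuracy(questions, predictions):
    """Return {category: accuracy} for multiple-choice predictions.

    questions:   list of dicts with hypothetical keys "id", "category", "answer"
                 ("answer" is the index of the correct option).
    predictions: dict mapping question id -> predicted option index.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q["category"]] += 1
        # A missing prediction is counted as wrong.
        if predictions.get(q["id"]) == q["answer"]:
            correct[q["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data with made-up category labels (not the real SafetyBench schema).
questions = [
    {"id": 0, "category": "cat_a", "answer": 1},
    {"id": 1, "category": "cat_a", "answer": 0},
    {"id": 2, "category": "cat_b", "answer": 2},
]
predictions = {0: 1, 1: 1, 2: 2}

scores = per_category_accuracy(questions, predictions)
print(scores)
```

Reporting accuracy per category rather than as a single number makes it easier to see which kinds of safety concern a model handles poorly.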
Benefits:
  • Promotes understanding of LLM safety capabilities.
  • Encourages responsible deployment by identifying safety flaws in AI models.
  • Provides an open-source platform for continuous improvement in LLM safety evaluations.
Highlights:
  • The SafetyBench paper was accepted at ACL 2024.
  • Accessible data and code for easy adoption in research and development.

Information

  • Publisher: AISecKit
  • Website: github.com
  • Published date: 2025/04/28

Categories

  • AI Ethics Resources
  • AI Research Papers

Tags

  • Safety Alignments
  • Open Source

More Products

agentic-design-patterns-cn

A bilingual Chinese-English translation of 'Agentic Design Patterns' by Antonio Gulli, focusing on intelligent systems design.

  • Categories: AI Application Platforms, AI Research Papers, AI Development Frameworks
  • Tags: AI Reasoning, Open Source, AI Education, AI Standards, AI Communities, +1 more

TradingAgents-CN

A Chinese-language financial trading framework based on multi-agent LLMs, supporting analysis of A-shares, Hong Kong stocks, and US stocks.

  • Categories: AI Application Platforms, AI Research Papers, AI Development Frameworks
  • Tags: Market Analysis, Open Source, LLM, AI Communities, Generative AI, +1 more

LangFair

LangFair is a Python library for conducting use-case level LLM bias and fairness assessments.

  • Categories: AI Models, AI Application Platforms, AI Ethics Resources
  • Tags: Responsible AI, LLM, Bias Mitigation