Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).
SafetyBench is a comprehensive benchmark designed specifically for evaluating the safety of large language models (LLMs). This GitHub repository contains resources, datasets, and guidelines for conducting safety evaluations. The benchmark comprises over 11,000 diverse multiple-choice questions organized into multiple categories of safety concerns, giving researchers a robust framework for assessing how safely LLMs respond.
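To make the multiple-choice setup concrete, below is a minimal sketch of an evaluation loop over such a question set. It is not the repository's official evaluation script: the field names (`question`, `options`, `answer`, `category`), the JSON file layout, and the `query_model` placeholder are all assumptions for illustration, and any real run would swap in the actual data format and model API.

```python
"""Minimal sketch of a multiple-choice safety evaluation loop.

Assumptions (not taken from the SafetyBench repo itself): questions are
stored as a JSON list of records with hypothetical fields `question`,
`options` (list of answer strings), `answer` (gold option index), and
`category`; `query_model` stands in for whatever LLM API is used.
"""
import json
from collections import defaultdict

OPTION_LABELS = "ABCD"


def build_prompt(record):
    """Format one multiple-choice question as a plain-text prompt."""
    lines = [record["question"]]
    for label, option in zip(OPTION_LABELS, record["options"]):
        lines.append(f"({label}) {option}")
    lines.append("Answer with a single option letter.")
    return "\n".join(lines)


def query_model(prompt):
    """Placeholder: replace with a real LLM call returning an option letter."""
    raise NotImplementedError


def evaluate(path):
    """Compute per-category accuracy over a labeled question file."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    correct, total = defaultdict(int), defaultdict(int)
    for record in records:
        prediction = query_model(build_prompt(record)).strip().upper()[:1]
        gold = OPTION_LABELS[record["answer"]]
        total[record["category"]] += 1
        correct[record["category"]] += prediction == gold
    return {cat: correct[cat] / total[cat] for cat in total}
```

Reporting accuracy per safety category, rather than a single aggregate score, is what lets a benchmark of this kind show where a model is weak rather than just how weak it is overall.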