Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).

A bilingual Chinese-English translation of 'Agentic Design Patterns' by Antonio Gulli, focusing on intelligent systems design.

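A Chinese-language financial trading framework built on multi-agent LLMs, supporting analysis of A-shares, Hong Kong stocks, and US stocks.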
SafetyBench is a benchmark for evaluating the safety of large language models (LLMs). This GitHub repository contains resources, datasets, and guidelines for running safety evaluations. The benchmark includes more than 11,000 diverse multiple-choice questions, grouped by category of safety concern, so researchers can assess an LLM's safety performance both overall and per category.
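Because the benchmark consists of multiple-choice questions grouped by category, scoring comes down to per-category accuracy. The following is a minimal sketch of that idea, not the repository's own evaluation script: the field names (`id`, `category`, `answer`), the category labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def accuracy_by_category(questions, predictions):
    """Return {category: fraction of questions answered correctly}.

    questions:   list of dicts with hypothetical keys "id", "category", "answer"
    predictions: dict mapping question id -> predicted option index
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for q in questions:
        total[q["category"]] += 1
        # A missing prediction never equals the gold answer, so it counts as wrong.
        if predictions.get(q["id"]) == q["answer"]:
            correct[q["category"]] += 1
    return {cat: correct[cat] / total[cat] for cat in total}


# Toy data invented for this example; not taken from the benchmark.
questions = [
    {"id": 0, "category": "category_a", "answer": 1},
    {"id": 1, "category": "category_a", "answer": 0},
    {"id": 2, "category": "category_b", "answer": 2},
]
predictions = {0: 1, 1: 3, 2: 2}

scores = accuracy_by_category(questions, predictions)
print(scores)
```

Reporting accuracy per category rather than a single overall number keeps a weak area from being hidden by strong results elsewhere.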