JailBench: A Comprehensive Chinese Security Assessment Benchmark for Large Language Models
JailBench is a large-scale dataset designed to evaluate the jailbreak attack risks of large language models in the Chinese context. Aligned with Chinese national cybersecurity standards, it aims to provide a thorough assessment of security vulnerabilities in AI-generated content.
Key Features:
- Extensive Dataset: Contains 10,800 test questions specifically designed to probe large language models' susceptibility to jailbreak attacks.
- Multi-Domain Coverage: The dataset encompasses five primary domains and 40 subdomains, ensuring comprehensive evaluation across various fields.
- Security Assessment: Provides a robust framework for assessing the security performance of large language models against jailbreak attacks.
- Research Contribution: Open-source access to the dataset encourages further research and development in AI security.
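A benchmark like this is typically consumed as a prompt set in an evaluation loop: each question is sent to the model under test, responses are judged as refusals or successful jailbreaks, and an attack success rate is reported (optionally per domain). The sketch below illustrates that loop under stated assumptions; the record fields (`domain`, `question`), the refusal markers, and the toy keyword judge are all hypothetical, not JailBench's actual schema or evaluation protocol, which in practice would use a far stronger (human or LLM-based) judge.

```python
# Minimal evaluation-loop sketch. SAMPLE mimics what benchmark records
# *might* look like; the real JailBench field names and format may differ.
SAMPLE = [
    {"domain": "illegal_activities", "question": "..."},
    {"domain": "ethics_morality", "question": "..."},
]

def is_refusal(response: str) -> bool:
    """Toy safety judge: treat common refusal phrases as a safe response.
    A production evaluation would use a much stronger judge."""
    markers = ("i cannot", "i can't", "unable to help", "抱歉")
    return any(m in response.lower() for m in markers)

def attack_success_rate(records, model) -> float:
    """Fraction of prompts for which the model did NOT refuse."""
    unsafe = sum(1 for r in records if not is_refusal(model(r["question"])))
    return unsafe / len(records)

def always_refuse(prompt: str) -> str:
    # Stand-in for a real model API call.
    return "I cannot help with that request."

print(attack_success_rate(SAMPLE, always_refuse))  # a fully-refusing model scores 0.0
```

In practice `model` would wrap an API call to the system under test, and per-domain rates would be aggregated by grouping records on the `domain` field.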
Benefits:
- Enhanced Security Testing: Facilitates the identification and mitigation of potential security risks in AI models.
- Standardized Evaluation: Aligns with national standards for cybersecurity, ensuring relevance and applicability in real-world scenarios.
- Community Collaboration: Encourages contributions and feedback from the research community to improve the dataset and its applications.
Highlights:
- Accepted at PAKDD 2025, underscoring its significance in the field of AI security research.
- Continuous updates and improvements based on community feedback and advancements in AI technology.