Safety-Prompts Repository
The Safety-Prompts repository provides a set of 100k Chinese safety prompts for evaluating and improving the safety of large language models (LLMs). The dataset covers various safety scenarios and adversarial instruction attacks, and is intended to help align model outputs with human values and strengthen models' safety knowledge.
Key Features:
- Diverse Dataset: Includes traditional safety scenarios and instruction attacks.
- Comprehensive Assessment: Facilitates evaluation of LLMs across multiple safety dimensions.
- Model Training Resource: Aids in the fine-tuning of safer language models.
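As a rough illustration of evaluating a model across multiple safety dimensions, the sketch below computes a per-category safe-response rate. It is a minimal example under stated assumptions, not the repository's own tooling: the `category` and `prompt` field names, the placeholder records, and the `dummy_generate` and `dummy_is_safe` helpers are all hypothetical stand-ins for the real data files, the model under test, and a safety judge.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable


def safe_rate_by_category(
    records: Iterable[dict],
    generate: Callable[[str], str],
    is_safe: Callable[[str, str], bool],
) -> Dict[str, float]:
    """Return the fraction of responses judged safe for each category."""
    totals = defaultdict(int)
    safe = defaultdict(int)
    for record in records:
        category = record["category"]  # assumed field name
        prompt = record["prompt"]      # assumed field name
        response = generate(prompt)
        totals[category] += 1
        if is_safe(prompt, response):
            safe[category] += 1
    return {category: safe[category] / totals[category] for category in totals}


# Placeholder records for illustration only; these are not real dataset entries.
sample_records = [
    {"category": "safety_scenario", "prompt": "placeholder prompt A"},
    {"category": "safety_scenario", "prompt": "placeholder prompt B"},
    {"category": "instruction_attack", "prompt": "placeholder prompt C"},
]


def dummy_generate(prompt: str) -> str:
    """Hypothetical stand-in for the model under test."""
    if prompt.endswith("C"):
        return "I cannot help with that."
    return "Sure: " + prompt


def dummy_is_safe(prompt: str, response: str) -> bool:
    """Hypothetical stand-in for a safety judge; treats refusals as safe."""
    return response.startswith("I cannot")


rates = safe_rate_by_category(sample_records, dummy_generate, dummy_is_safe)
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

In practice, `generate` would wrap a call to the model being evaluated and `is_safe` would be a human annotation step or a separate classifier; reporting one rate per category keeps weaknesses in a single safety dimension from being hidden by an overall average.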
Benefits:
- Improve the reliability and safety of Chinese LLM outputs.
- Enhance alignment of model behaviors with societal norms and regulations.
- Support researchers and developers in building safer AI applications.
Highlights:
- Provides a variety of prompts covering a broad range of safety issues.
- Includes links to additional resources and related research publications.
- Encourages feedback and collaboration within the community for ongoing improvements.




