
A collection of Chinese safety prompts for evaluating and improving the safety of LLMs.

A research project for assessing and aligning the values of Chinese large language models, with a focus on safety and responsibility.

Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).

The official implementation of InjecGuard, a tool for benchmarking and mitigating over-defense in prompt injection guardrail models.