Chinese safety prompts for evaluating and improving the safety of LLMs.
A research project for assessing and aligning the values of Chinese large language models, with a focus on safety and responsibility.
Official GitHub repository for SafetyBench, a benchmark to evaluate the safety of large language models (LLMs).
The official implementation of InjecGuard, a tool for benchmarking and mitigating over-defense in prompt injection guardrail models.
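As a rough illustration of how prompt sets and benchmarks like those listed above are typically used, the sketch below runs an instruction-tuned model over a file of safety prompts and saves the responses for later safety review. The file name, field names, and model checkpoint are assumptions for illustration only and are not taken from any of these repositories.

```python
# Minimal sketch, not tied to any repository above: generate responses to a
# JSONL file of safety prompts so they can be reviewed or scored afterwards.
import json

from transformers import pipeline

# Assumed checkpoint; swap in whichever chat model is being evaluated.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

# Hypothetical prompt file with one {"prompt": "..."} object per line.
with open("safety_prompts.jsonl", encoding="utf-8") as f:
    prompts = [json.loads(line)["prompt"] for line in f]

with open("responses.jsonl", "w", encoding="utf-8") as out:
    for prompt in prompts:
        result = generator(
            [{"role": "user", "content": prompt}],  # chat-style input
            max_new_tokens=256,
        )
        reply = result[0]["generated_text"][-1]["content"]  # assistant turn
        out.write(json.dumps({"prompt": prompt, "response": reply}, ensure_ascii=False) + "\n")
```

The collected responses can then be judged manually, with a safety classifier, or against a benchmark's own scoring scripts, depending on the resource being used.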