A comprehensive open-source tutorial on large-scale pre-trained language models, covering theory and practical applications.
A research project assessing and aligning the values of Chinese large language models, focusing on safety and responsibility.
This repository contains the code for generating the ToxiGen dataset for hate speech detection.
A framework for optimizing prompts with a self-evolving mechanism for better task performance.
A study evaluating geopolitical and cultural biases in large language models through dual-layered assessments.
A guidebook sharing insights and knowledge about evaluating Large Language Models (LLMs).
A one-click face-swap tool for replacing faces in videos, though installation requires some technical skill.
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
Protect AI focuses on securing machine learning and AI applications with various open-source tools.
Dataset for classifying prompts as jailbreak or benign to enhance LLM safety.
A dataset of 15,140 ChatGPT prompts, including 1,405 jailbreak prompts, collected from various platforms for research purposes.