Unofficial implementation of backdooring instruction-tuned LLMs using virtual prompt injection.
Protect AI focuses on securing machine learning and AI applications with various open-source tools.
This project investigates the security of large language models by classifying input prompts to discover malicious ones.
Save your precious prompt from leaking with minimal cost.
A GitHub repository for developing adversarial attack techniques using injection prompts.
A collection of state-of-the-art jailbreak methods for LLMs, including papers, codes, datasets, and analyses.