Newsletter
Join the Community
Subscribe to our newsletter for the latest news and updates
Unofficial implementation of backdooring instruction-tuned LLMs using virtual prompt injection.
The repository implements the concept of Virtual Prompt Injection (VPI), a technique for executing backdoor attacks on instruction-tuned large language models (LLMs). Proposed in the paper "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection", VPI allows attackers to manipulate LLM behavior without altering model input during inference.