This project investigates the security of large language models by classifying prompts to detect malicious injections.
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"
The official implementation of a preprint paper on prompt injection attacks against large language models.
Uses ChatGPT to filter out potentially dangerous user-supplied questions.
A prompt injection game that collects data for research on ML robustness.
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
Project Mantis is a tool that counters LLM-driven cyberattacks by using prompt injection techniques against the attacking agent.
A steganography tool for encoding images as prompt injections for AIs with vision capabilities.
Custom node for ComfyUI enabling specific prompt injections within Stable Diffusion UNet blocks.
A benchmark for evaluating prompt injection detection systems.