AI agent to conduct vulnerability tests on LLMs from SAP AI Core or local deployments, identifying and correcting security vulnerabilities.
Targeted Adversarial Examples on Speech-to-Text systems.
A PyTorch adversarial library for attack and defense methods on images and graphs.
Advbox is a toolbox for generating adversarial examples to test the robustness of neural networks across various frameworks.
A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX.
A controllable SONAR image generation framework utilizing text-to-image diffusion and GPT prompting for enhanced diversity and realism.
Unofficial implementation of backdooring instruction-tuned LLMs using virtual prompt injection.
Code to generate NeuralExecs for prompt injection attacks tailored for LLMs.
Discover the leaked system instructions and prompts for ChatGPT's custom GPT plugins.
A repository for benchmarking prompt injection attacks against AI models like GPT-4 and Gemini.
Fine-tuning base models to create robust task-specific models for better performance.