Category
Explore by categories

adversarial-reinforcement-learning
Curated reading list for adversarial perspective and robustness in deep reinforcement learning.

Counterfit
A CLI that provides a generic automation layer for assessing the security of ML models.

DeepRobust
A PyTorch adversarial library for attack and defense methods on images and graphs.

AdvBox
Advbox is a toolbox for generating adversarial examples to test the robustness of neural networks across various frameworks.

advertorch
A Python toolbox for adversarial robustness research, implemented in PyTorch.

Adversarial Robustness Toolbox
A Python library designed to enhance machine learning security against adversarial threats.

Foolbox
A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX.

CleverHans
An adversarial example library for constructing attacks, building defenses, and benchmarking both.

prompt_injection_research
This research proposes defense strategies against prompt injection in large language models to improve their robustness and security against unwanted outputs.

AIAnytime/Prompt-Injection-Prevention
GitHub repository for techniques to prevent prompt injection in AI chatbots using LLMs.

InjecGuard
The official implementation of InjecGuard, a tool for benchmarking and mitigating over-defense in prompt injection guardrail models.

SecAlign
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"