LogoAISecKit
  • Search
  • Collection
  • Category
  • Tag
  • Blog
  • Pricing
  • Submit
LogoAISecKit

Newsletter

Join the Community

Subscribe to our newsletter for the latest news and updates

LogoAISecKit

Curated directory of 1700+ AI tools, models, frameworks, MCP servers, and cybersecurity resources

GitHub
Product
  • Search
  • Collection
  • Category
  • Tag
Resources
  • Blog
  • Pricing
  • Submit
Company
  • About Us
  • Privacy Policy
  • Terms of Service
  • Sitemap
Copyright © 2026 All Rights Reserved.
Sponsored Resources
  1. Home
  2. Category
  3. InjecGuard
icon of InjecGuard

InjecGuard

The official implementation of InjecGuard, a tool for benchmarking and mitigating over-defense in prompt injection guardrail models.

Visit Website
image for InjecGuard
Visit Website

Introduction

InjecGuard

InjecGuard is the first prompt guard model against prompt injection attacks, designed to benchmark and mitigate over-defense issues prevalent in existing models. This repository not only contains the official code implementations but also incorporates various datasets that facilitate thorough evaluations of guardrail models.

Key Features:
  • Innovative Model: InjecGuard tackles the common challenge of over-defense in prompt guard models which falsely classify benign inputs as malicious.
  • NotInject Dataset: A specialized evaluation dataset created to assess the extent of over-defense, helping to improve model accuracy.
  • Open Source: Complete access to model weights and training strategies allows the community to contribute and enhance robustness.
  • Pre-trained Checkpoints: Quickly deploy models using Hugging Face Transformers to facilitate seamless integration into existing workflows.
Benefits:
  • Robust Defense: Achieves state-of-the-art performance in the field, significantly reducing trigger word bias.
  • Comprehensive Evaluations: Includes mechanisms for testing against various datasets ensuring reliable model performance across different conditions.
Highlights:
  • Released on Hugging Face for easy deployment.
  • Extensive documentation and guidelines for usage, training, and evaluation.
  • Effective in real-world applications of AI language models against prompt injection risks.
Back

Information

  • Publisher
    AISecKit
  • Websitegithub.com
  • Published date2025/04/27

Categories

  • AI Research Papers
  • Model Robustness Enhancement
  • Prompt Injection Defense

Tags

  • Prompt Injection
  • Model Robustness
  • Compliance
  • Safety Alignments
  • Security Auditing
  • Open Source
  • Adversarial Examples

More Products

image of agentic-design-patterns-cn
AI Application PlatformsAI Research PapersAI Development Frameworks
Visit Website
icon of agentic-design-patterns-cn

agentic-design-patterns-cn

A bilingual Chinese-English translation of 'Agentic Design Patterns' by Antonio Gulli, focusing on intelligent systems design.

AI ReasoningOpen SourceAI EducationAI StandardsAI Communities+1
image of TradingAgents-CN
AI Application PlatformsAI Research PapersAI Development Frameworks
Visit Website
icon of TradingAgents-CN

TradingAgents-CN

基于多智能体LLM的中文金融交易框架,支持A股/港股/美股分析。

Market AnalysisOpen SourceLLMAI CommunitiesGenerative AI+1
P
Prompt Injection Defense
Visit Website
icon of prmptinj

prmptinj

Curated + custom prompt injections for AI models, focusing on security and exploit development.

AI EthicsPrompt InjectionComplianceExploit DevelopmentVulnerability Disclosure