LogoAISecKit
  • Search
  • Collection
  • Category
  • Tag
  • Blog
  • Pricing
  • Submit
LogoAISecKit

Newsletter

Join the Community

Subscribe to our newsletter for the latest news and updates

LogoAISecKit

Curated directory of 1700+ AI tools, models, frameworks, MCP servers, and cybersecurity resources

GitHub
Product
  • Search
  • Collection
  • Category
  • Tag
Resources
  • Blog
  • Pricing
  • Submit
Company
  • About Us
  • Privacy Policy
  • Terms of Service
  • Sitemap
Copyright © 2026 All Rights Reserved.
Sponsored Resources
  1. Home
  2. Category
  3. jackhhao/jailbreak-classification
icon of jackhhao/jailbreak-classification

jackhhao/jailbreak-classification

Dataset for classifying prompts as jailbreak or benign to enhance LLM safety.

Visit Website
image for jackhhao/jailbreak-classification
Visit Website

Introduction

Jailbreak Classification Dataset

The jackhhao/jailbreak-classification dataset is designed to classify prompts as either jailbreak or benign. This dataset is crucial for enhancing the safety of large language models (LLMs) by helping to detect and prevent harmful jailbreak prompts that users might employ when interacting with these models.

Key Features:
  • Classification Labels: Each prompt is labeled as either 'jailbreak' or 'benign', providing clear categorization for model training.
  • Source Data: The dataset includes prompts sourced from various repositories, ensuring a diverse range of examples for effective training.
  • Model Training: Several models have been trained or fine-tuned using this dataset, enhancing their ability to recognize and respond to potentially harmful prompts.
Benefits:
  • Improved Safety: By classifying prompts, the dataset aids in the development of more secure LLMs, reducing the risk of exploitation through jailbreak prompts.
  • Open Source Contribution: The dataset is part of the open-source movement, promoting transparency and collaboration in AI development.
Highlights:
  • Curation Rationale: The dataset was created with the intent to advance AI safety and ethics, making it a valuable resource for researchers and developers in the field.
Back

Information

  • Publisher
    AISecKit
  • Websitehuggingface.co
  • Published date2025/04/26

Categories

  • AI Models
  • AI Application Platforms
  • Jailbreak Prevention

Tags

  • AI Ethics
  • Prompt Injection
  • Model Robustness
  • Jailbreak Detection
  • Security Auditing
  • Responsible AI

More Products

image of Nano Bananary
AI ModelsAI Application PlatformsAI Video Tools
Visit Website
icon of Nano Bananary

Nano Bananary

Nano Bananary is an AI batch image and video generator with 142 effects.

Text-to-VideoGenerative AI
image of Twocast
AI Application PlatformsAI Productivity ToolsAI Audio Tools
Visit Website
icon of Twocast

Twocast

AI Podcast Generator for bilingual episodes, supporting multiple languages and alternative to NotebookLLM.

Content Creation
image of ZCF
AI Application PlatformsAI Productivity ToolsAI Development Frameworks
Visit Website
icon of ZCF

ZCF

Zero-Config Code Flow for Claude code & Codex, enabling seamless integration and configuration for AI development.

Open SourceClaude