LogoAISecKit
  • Search
  • Collection
  • Category
  • Tag
  • Blog
  • Pricing
  • Submit
LogoAISecKit

Newsletter

Join the Community

Subscribe to our newsletter for the latest news and updates

LogoAISecKit

Curated directory of 1700+ AI tools, models, frameworks, MCP servers, and cybersecurity resources

GitHub
Product
  • Search
  • Collection
  • Category
  • Tag
Resources
  • Blog
  • Pricing
  • Submit
Company
  • About Us
  • Privacy Policy
  • Terms of Service
  • Sitemap
Copyright © 2026 All Rights Reserved.
Sponsored Resources
  1. Home
  2. Category
  3. llm-security-prompt-injection
icon of llm-security-prompt-injection

llm-security-prompt-injection

This project investigates the security of large language models by classifying prompts to discover malicious injections.

Visit Website
image for llm-security-prompt-injection
Visit Website

Introduction

Detailed Introduction

This GitHub project focuses on investigating the security of large language models (LLMs) with a primary emphasis on prompt injection attacks. The study involves:

  • Binary Classification: Performing binary classification on a dataset of input prompts to identify malicious prompts that can manipulate LLM behavior.
  • Methodology: Different approaches are analyzed, including:
    • Classical Machine Learning algorithms (Naive Bayes, Logistic Regression, Support Vector Machine, Random Forest)
    • A pre-trained LLM model (XLM-RoBERTa) without fine-tuning
    • A fine-tuned LLM model (XLM-RoBERTa with training on the dataset)
  • Dataset: Utilizes the deepset Prompt Injection Dataset, comprising hundreds of samples in English and other languages, pre-split into training and testing subsets.
  • Results and Analysis: The performance of different classification methods is compared, providing insights into detection capabilities and model accuracy.
Key Features:
  • Prompt Injection Detection: Specialized in identifying malicious input prompts targeting LLMs.
  • Robust Methodologies: Employs various state-of-the-art ML techniques and frameworks to improve detection accuracy.
  • Comprehensive Dataset: Leverages a rich dataset from deepset to ensure robust training and testing of models.
Benefits:
  • Enhances understanding of security issues pertaining to LLMs.
  • Provides tools and methodologies to improve prompt security in AI applications.
  • Aims to contribute valuable findings to the field of AI security research.
Back

Information

  • Publisher
    AISecKit
  • Websitegithub.com
  • Published date2025/04/27

Categories

  • Security Research
  • AI Security Monitoring
  • Prompt Injection Defense

Tags

  • Prompt Injection
  • Model Robustness
  • Risk Assessment
  • LLM
  • Adversarial Examples

More Products

P
Prompt Injection Defense
Visit Website
icon of prmptinj

prmptinj

Curated + custom prompt injections for AI models, focusing on security and exploit development.

AI EthicsPrompt InjectionComplianceExploit DevelopmentVulnerability Disclosure
P
AI ModelsAI Security MonitoringPrompt Injection Defense
Visit Website
icon of prompt.fail

prompt.fail

Explore prompt injection techniques in large language models (LLMs), providing examples to improve LLM security and robustness.

Prompt InjectionModel RobustnessComplianceRisk AssessmentSecurity Frameworks+1
E
Penetration TestingSecurity Training PlatformsAI Security Monitoring
Visit Website
icon of Exploiting AI

Exploiting AI

An introductory class on understanding AI security risks and mitigation strategies.

Prompt InjectionGenerative AIRed Team TestingData Poisoning