PINT Benchmark
The PINT Benchmark is designed to evaluate the performance of prompt injection detection systems, such as Lakera Guard, without relying on known public datasets. This ensures that the evaluation is unbiased and accurate. Here are some key features and benefits:
Key Features:
- Comprehensive Dataset: The PINT dataset includes 4,314 inputs, with a mix of English and non-English data, ensuring a robust evaluation.
- Neutral Evaluation: All evaluated solutions are not trained on the dataset, providing a fair comparison.
- Custom Dataset Support: Users can benchmark their own datasets by formatting them as YAML files or using pandas DataFrames.
- Multiple Categories: The benchmark supports various categories of prompt injections, including public datasets and proprietary data.
Benefits:
- Improved Security: By evaluating prompt injection detection systems, the PINT Benchmark helps enhance the security of generative AI systems.
- Community Contributions: The project welcomes contributions from all parties to improve the benchmark and its methodologies.
- User-Friendly: The benchmark can be easily run using a Jupyter Notebook, making it accessible for developers and researchers.
Highlights:
- Continuous improvements to the dataset to maintain its robustness.
- Examples provided for evaluating various prompt injection detection models.
- Open to feedback and collaboration to enhance the benchmark's effectiveness.
