
LLM-Evaluation

Sample notebooks and prompts for evaluating large language models (LLMs) and generative AI.

Introduction


The LLM Evaluation repository provides a collection of sample notebooks and prompts designed for evaluating large language models (LLMs) and generative AI systems. It is aimed at researchers and practitioners who want to understand and assess LLM performance across a range of contexts.

Key Features:
  • Sample Notebooks: Includes Jupyter notebooks that demonstrate evaluation techniques and methodologies for LLMs.
  • Prompts for Evaluation: A curated set of prompts that can be used to test and evaluate the capabilities of LLMs.
  • Workshop Resources: Contains materials from evaluation workshops, including slides and additional resources for deeper learning.
  • OpenAI API Integration: Some notebooks require an OpenAI API key so that evaluation prompts can be run against OpenAI models (see the sketch after this list).
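
As a minimal sketch of what an API-backed evaluation step can look like, the snippet below sends a hypothetical evaluation prompt to an OpenAI model and then uses a second call as an LLM-as-judge to score the answer. The prompt text, model name, and 1-5 scoring scale are illustrative assumptions, not contents of the repository's notebooks or prompt sets.

```python
# Illustrative sketch of an API-backed evaluation step (not taken from the
# repository). Requires `pip install openai` and an OPENAI_API_KEY
# environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical evaluation prompt; the repository ships its own prompt sets.
question = "Explain the difference between precision and recall in two sentences."

# 1) Get the model's answer.
answer = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; any chat-capable model works
    messages=[{"role": "user", "content": question}],
).choices[0].message.content

# 2) Ask the model, acting as a judge, to score the answer from 1 to 5.
judge_prompt = (
    "Rate the following answer for correctness and concision on a 1-5 scale. "
    "Reply with only the number.\n\n"
    f"Question: {question}\nAnswer: {answer}"
)
score = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": judge_prompt}],
).choices[0].message.content

print(f"Answer:\n{answer}\n\nJudge score: {score}")
```

A full evaluation would loop this pattern over a dataset of prompts and aggregate the scores; the single-prompt version above is only the smallest runnable unit.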
Benefits:
  • Hands-On Learning: Users can interact with LLMs and learn through practical examples and guided notebooks.
  • Community Contributions: The repository encourages contributions from the community, fostering collaboration and knowledge sharing.
  • Regular Updates: The repository is actively maintained, with updates planned for future workshops and resources.
Highlights:
  • Resources for evaluating LLMs and generative AI.
  • Links to conference presentations and videos for further learning.
  • A focus on practical applications and real-world use cases for LLM evaluation.
