An open-source framework for testing and evaluating large language model (LLM) outputs.
DeepEval is a simple-to-use, open-source LLM evaluation framework designed to test and evaluate large language model (LLM) outputs. It aims to be a specialized unit-testing tool, similar to Pytest but tailored for LLM applications.
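As a minimal sketch of what this looks like in practice, the snippet below writes an ordinary Pytest test using DeepEval's `LLMTestCase`, `AnswerRelevancyMetric`, and `assert_test` helpers; the example strings and threshold are illustrative, and running the metric assumes an evaluation model (e.g. an OpenAI API key) is configured.

```python
from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric


def test_answer_relevancy():
    # Capture a single LLM interaction as a test case.
    test_case = LLMTestCase(
        input="What is DeepEval?",
        actual_output="DeepEval is an open-source framework for evaluating LLM outputs.",
    )

    # Score the output's relevancy to the input; fail the test below the threshold.
    metric = AnswerRelevancyMetric(threshold=0.7)

    # Behaves like a Pytest assertion: raises if the metric does not pass.
    assert_test(test_case, [metric])
```

A test like this can be collected and run by Pytest in the usual way, or through DeepEval's own test runner, so LLM evaluations slot into an existing test suite.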
DeepEval equips developers and researchers alike with powerful tools to ensure their LLM systems meet high standards of performance and relevance.