Tag
Explore by tags

EuroBERT
EuroBERT is a multilingual encoder model designed for European languages, trained using the Optimus training library.

llmkit
A provider-agnostic, OpenAI API-compatible inference server and UI toolkit for prompt management, versioning, testing, and evaluation.

RagaAI Catalyst
Python SDK for an agent AI observability, monitoring, and evaluation framework.

LLMBox
A comprehensive library for implementing LLMs with a unified training pipeline and model evaluation.

LLM Evaluation Guidebook
A guidebook sharing insights and knowledge about evaluating Large Language Models (LLMs).

Awesome LLMs Evaluation Papers
A comprehensive collection of papers focused on evaluating large language models (LLMs).

LLM AutoEval
Automatically evaluate your LLMs in Google Colab with LLM AutoEval.

EvalScope
A customizable framework for efficient large model evaluation and performance benchmarking.

nano-aha-moment
Efficient full-parameter tuning library for reinforcement learning applications in LLMs.

GreatLibrarianFrontend
Front-end framework for a large-model testing toolkit, supporting efficient testing and collaborative evaluation of large language models.

GreatLibrarian
Scenario-based large-model testing toolbox for automating evaluations of large language models.
