
A curated list of tools, datasets, demos, and papers for evaluating large language models (LLMs).

Sample notebooks and prompts for evaluating large language models (LLMs) and generative AI.

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

Open-source evaluation toolkit for large multi-modality models, supporting 220+ models and 80+ benchmarks.

Automatable GenAI Scripting for programmatically assembling prompts for LLMs using JavaScript.

A powerful framework for building real-time voice AI agents.

Real-time face swap and one-click video deepfake with only a single image.

Fuses ChatTTS with OpenVoice to clone your voice from an uploaded 10-second audio clip.

Real-time, voice-interactive digital human with low latency and customizable appearance and voice.

Large Language Model in Action is a GitHub repository demonstrating various implementations and applications of large language models.

RF-DETR is a real-time object detection model architecture developed by Roboflow, state-of-the-art on COCO and designed for fine-tuning.