
A comprehensive library for implementing LLMs with a unified training pipeline and model evaluation.

Automatically evaluate your LLMs in Google Colab with LLM AutoEval.

Self-evaluating interview for AI coders.

A customizable framework for efficient large model evaluation and performance benchmarking.

A collection of benchmarks and datasets for evaluating large language models (LLMs).

VideoMind is a Chain-of-LoRA Agent designed for long video reasoning using human-like processes.

An AI model API management and distribution system that supports unified access to multiple large models and provides distribution management services for both enterprise and individual use.

Code for Segment Any Motion in Videos, a method for segmenting moving objects in video sequences.

A Python-based automated testing framework for evaluating the performance and inference capabilities of large language models.

A reverse-engineered API for the Kimi AI long-text model, supporting high-speed streaming output, intelligent dialogue, and document interpretation.

Run LLMs with MLX, a Python package for generating text and fine-tuning large language models on Apple silicon.

Official code for the CVPR 2025 oral paper on biomechanically accurate human reconstruction.