A curated list of tools, datasets, demos, and papers for evaluating large language models (LLMs).
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
A study evaluating geopolitical and cultural biases in large language models through dual-layered assessments.
A comprehensive survey on benchmarks for Multimodal Large Language Models (MLLMs).
EuroBERT is a multilingual encoder model designed for European languages, trained using the Optimus training library.
A Chinese legal dialogue language model designed to provide professional, reliable answers to legal questions.
InfiniteYou enables flexible photo recrafting that preserves the subject's identity, using Diffusion Transformers.
Automated tool for tracking AI trends through Reddit insights in English and Chinese.
Fully local web research assistant using LLMs for generating queries, summarizing results, and writing reports.
A platform aggregating top programming, AI, product, and tech articles with AI-driven summaries and ratings.