A unified evaluation framework for large language models.

PromptBench is a PyTorch-based Python package for evaluating large language models (LLMs). It provides user-friendly APIs that let researchers run LLM evaluations efficiently.
For more information, visit the GitHub repository.