Enhanced edition of GeoIP files in various formats, including V2Ray, Clash, and Surge.
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools, with Auth!
AutoAudit is a large language model (LLM) designed to enhance cybersecurity through AI-driven threat detection and response.
A curated list of tools, datasets, demos, and papers for evaluating large language models (LLMs).
Sample notebooks and prompts for evaluating large language models (LLMs) and generative AI.
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Open-source framework for evaluating and testing AI and LLM systems, covering performance, bias, and security issues.
Phoenix is an open-source AI observability platform for experimentation, evaluation, and troubleshooting.