Tag
Explore by tags

Awesome-LLM-Eval
A curated list of tools, datasets, demos, and papers for evaluating large language models (LLMs).

LLM-eval-survey
The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

arxiv-mcp-server
A Model Context Protocol server for searching and analyzing arXiv papers.

Local Deep Researcher
A fully local web research assistant that uses LLMs to generate queries, summarize results, and write reports.

Awesome LLMs Evaluation Papers
A comprehensive collection of papers focused on evaluating large language models (LLMs).

Gallia
Gallia is an extendable penetration testing framework focused on the automotive domain.

DeepGit
DeepGit is an advanced research agent designed to help users find the best GitHub repositories.

mad-professor
An AI companion that enhances paper reading with interactive features and a quirky AI professor persona.

Universal-Prompt-Injection
The official implementation of a preprint on prompt injection attacks against large language models.