LLM inference in C/C++ with minimal setup and high performance for various hardware.

Llama.cpp is a project that enables Large Language Model (LLM) inference in pure C/C++. It provides a library and tooling that let developers integrate and experiment with LLMs, including Meta's LLaMA family and many other models, with minimal setup and state-of-the-art performance across a wide range of hardware, both locally and in the cloud.
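As a rough illustration of how the library side is used, the sketch below loads a GGUF model and tokenizes a prompt through the llama.h C API. It is only a sketch under assumptions: function names and signatures have shifted between llama.cpp releases (for example, the tokenizer now takes a vocab handle rather than the model in newer versions), and the model path is a placeholder.

```cpp
// Minimal sketch: load a GGUF model and tokenize a prompt via the llama.h C API.
// Names follow one mid-2024 revision of the API and may differ in other releases;
// "model.gguf" is a placeholder path.
#include "llama.h"
#include <cstdio>
#include <string>
#include <vector>

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    llama_model * model = llama_load_model_from_file("model.gguf", mparams);
    if (model == nullptr) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    llama_context_params cparams = llama_context_default_params();
    llama_context * ctx = llama_new_context_with_model(model, cparams);

    // Tokenize a prompt; the exact llama_tokenize signature varies across releases.
    std::string prompt = "Hello";
    std::vector<llama_token> tokens(prompt.size() + 8);
    int n = llama_tokenize(model, prompt.c_str(), (int) prompt.size(),
                           tokens.data(), (int) tokens.size(),
                           /*add_special*/ true, /*parse_special*/ false);
    printf("prompt tokenized into %d tokens\n", n);

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```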
The project ships several command-line tools: llama-cli for direct access to model functionality, llama-server for serving models over HTTP, and llama-bench for performance benchmarking.
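To show how a served model might be consumed, here is a hedged sketch of a client querying a locally running llama-server over its OpenAI-compatible chat endpoint using libcurl. The host, port, prompt, and request fields are assumptions for illustration; llama-server listens on http://localhost:8080 by default unless configured otherwise.

```cpp
// Minimal sketch: POST a chat request to a local llama-server instance with libcurl
// and print the raw JSON response. Endpoint and port assume default server settings.
#include <curl/curl.h>
#include <iostream>
#include <string>

// libcurl write callback: append received bytes to a std::string.
static size_t collect(char * data, size_t size, size_t nmemb, void * userp) {
    static_cast<std::string *>(userp)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL * curl = curl_easy_init();
    if (!curl) return 1;

    // Example request body; message content and max_tokens are placeholders.
    const std::string body = R"({
        "messages": [{"role": "user", "content": "Explain GGUF in one sentence."}],
        "max_tokens": 64
    })";

    std::string response;
    struct curl_slist * headers = curl_slist_append(nullptr, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:8080/v1/chat/completions");
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK) {
        std::cout << response << std::endl;   // raw JSON completion from the server
    } else {
        std::cerr << curl_easy_strerror(rc) << std::endl;
    }

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}
```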