Introduction
Guidance is an efficient programming paradigm for steering language models. It lets you control the structure and quality of model output while reducing latency and cost compared to traditional prompting. Key features include:
- Pure Python Programming: Write pure Python code with additional language model functionalities.
- Constrained Generation: Utilize selects, regex, and context-free grammars to define output constraints.
- Tool Integration: Call and manage tools within the generation process, with control flow and generation interleaved automatically.
- Cross-Backend Compatibility: Execute a single Guidance program on multiple backends such as Transformers, Llama.cpp, and OpenAI.
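- High Efficiency: Integrated stateful control avoids extra round-trips to the model, and text supplied by the program (rather than generated by the model) is batched automatically instead of being processed one token at a time.
- Token Healing: Automatically corrects tokenization artifacts at the boundary between the prompt and the generated text.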
- Rich Templates: Leverage f-strings for rich message formatting.
- Multi-modal Support: Accepts input types beyond plain text.
With Guidance, developers can build complex interactions with language models with greater precision and less effort. Because a single program runs across multiple backends, it suits developers who need both efficiency and control over language model output.
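To make the idea behind the "select" form of constrained generation concrete, here is a minimal sketch in plain Python. It does not use the Guidance API: `constrained_select` and `toy_score` are hypothetical names introduced for illustration, and the hard-coded scores stand in for the log-probabilities a real model would assign. The point is only that the result is guaranteed to be one of the allowed options, whatever the model prefers.

```python
from typing import Callable, Sequence


def constrained_select(options: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring option.

    The result is always a member of `options`, which is the guarantee
    a select-style constraint provides.
    """
    if not options:
        raise ValueError("options must not be empty")
    return max(options, key=score)


def toy_score(text: str) -> float:
    # Hypothetical stand-in for a model's log-probability of `text`.
    # A real backend would compute this from the model; these values are made up.
    preferences = {"positive": -0.2, "negative": -1.5, "neutral": -0.9}
    return preferences.get(text, float("-inf"))


# The chosen label can only ever be one of the three allowed strings.
label = constrained_select(["positive", "negative", "neutral"], toy_score)
print(label)
```

Narrowing the option list changes which answers are possible without touching the scoring function, which is the appeal of expressing the constraint separately from the model.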