# DistillFlow
DistillFlow is an open-source toolkit designed to simplify and scale the distillation of large language models (LLMs) into smaller, more efficient models. It provides a flexible pipeline for distillation, fine-tuning, and experimentation across multiple GPUs, with support for dynamic resource allocation and easy integration of custom techniques.
## Key Features
- Multi-Strategy Distillation: Supports multiple distillation techniques, including logits-based, attention-based, and layer-based distillation (see the sketch after this list).
- Dynamic Resource Allocation: Automatically distributes tasks across GPUs or nodes based on available memory.
- Fine-Tuning Support: Allows for domain-specific and downstream fine-tuning of distilled models.
- Model Loading Optimizations: Supports optimized model loading using Unsloth, Liger Kernel, Flash Attention, etc.
- Easy Integration: Compatible with popular libraries like Hugging Face Transformers, PyTorch, and DeepSpeed.
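The logits-based strategy, for example, trains the student to match the teacher's softened output distribution while still fitting the ground-truth labels. The snippet below is a minimal sketch of that idea in plain PyTorch, not DistillFlow's internal implementation; the function name, hyperparameters, and usage pattern are illustrative assumptions.

```python
# Minimal sketch of logits-based distillation (illustrative, not DistillFlow's API).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term (teacher vs. student) with hard-label cross-entropy."""
    # Soft targets: match the student's distribution to the teacher's,
    # both softened by the temperature.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Typical use in a training step (teacher frozen, student trainable):
#   with torch.no_grad():
#       teacher_logits = teacher(input_ids).logits
#   student_logits = student(input_ids).logits
#   loss = distillation_loss(student_logits, teacher_logits, labels)
```

The temperature softens both distributions so the student can learn from the teacher's relative ranking of unlikely tokens, and the loss is rescaled by the squared temperature to keep gradient magnitudes comparable to the hard-label term.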
## Benefits
- Simplifies the process of model distillation, making it accessible to developers and researchers.
- Enhances the efficiency of deploying machine learning models by reducing their size without significant loss of performance.
- Facilitates experimentation with different models and datasets, promoting innovation in AI development.
## Highlights
- Supports any Hugging Face dataset in the ShareGPT or Alpaca format.
- Provides a fully configurable pipeline for distillation, allowing users to specify teacher and student models, datasets, and distillation types (see the configuration sketch below).
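To make this concrete, the sketch below shows one record in the Alpaca format and a hypothetical configuration dictionary wiring together a teacher, a student, a dataset, and a distillation type. The key names, model checkpoints, and dataset identifier are illustrative assumptions, not DistillFlow's documented schema; consult the project's documentation for the actual option names.

```python
# Illustrative only: an Alpaca-format record and a hypothetical pipeline config.

# One record in the Alpaca format (instruction / optional input / output):
alpaca_record = {
    "instruction": "Summarize the following paragraph.",
    "input": "DistillFlow is an open-source toolkit for distilling LLMs ...",
    "output": "DistillFlow helps compress large language models into smaller ones.",
}

# Hypothetical configuration mapping the pieces the pipeline needs
# (key names are placeholders, not DistillFlow's real schema):
distill_config = {
    "teacher_model": "meta-llama/Llama-3.1-8B-Instruct",  # assumed example checkpoint
    "student_model": "Qwen/Qwen2.5-0.5B-Instruct",        # assumed example checkpoint
    "dataset": "tatsu-lab/alpaca",                        # any Hugging Face dataset in a supported format
    "distillation_type": "logits",                        # or "attention" / "layers"
    "temperature": 2.0,
    "alpha": 0.5,
}
```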