MiniMind: Train a 26M-Parameter GPT from Scratch in Just 2 Hours!
MiniMind is an open-source project that aims to lower the barrier to entry for training large language models (LLMs). On a single NVIDIA 3090 GPU, users can train a lightweight 26M-parameter GPT model from scratch in about 2 hours, at a cost of around 3 RMB.
Key Features:
- Lightweight Model: The smallest version of MiniMind has only 25.8M parameters (the 26M quoted above, rounded), small enough to train on a personal GPU.
- Comprehensive Training Process: The project includes detailed code for pre-training, supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and model distillation.
- Open Source: All core algorithms are implemented from scratch in PyTorch rather than delegated to higher-level third-party libraries.
- Multi-Modal Capabilities: MiniMind has been extended to support visual multi-modal tasks with MiniMind-V.
- User-Friendly: The project doubles as a tutorial for LLM beginners, offering hands-on experience in training large models and understanding their underlying mechanisms.
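
To give a feel for where a figure like 25.8M comes from, here is a minimal sketch that counts the weights of a small decoder-only transformer in plain Python. Every value in TinyGPTConfig (vocabulary size, width, depth, head counts, the SwiGLU-style feed-forward sizing, tied embeddings) is an illustrative assumption picked so the total lands near the number quoted above. The text here does not state MiniMind's actual configuration, which may differ.

```python
from dataclasses import dataclass


@dataclass
class TinyGPTConfig:
    # Hypothetical values for illustration only; not confirmed by the project text.
    vocab_size: int = 6400
    d_model: int = 512
    n_layers: int = 8
    n_heads: int = 8
    n_kv_heads: int = 2          # grouped-query attention: fewer K/V heads than Q heads
    ffn_multiple_of: int = 64    # feed-forward width is rounded up to this multiple
    tie_embeddings: bool = True  # output head shares weights with the token embedding


def ffn_hidden_dim(cfg: TinyGPTConfig) -> int:
    """SwiGLU-style sizing: 2/3 of 4*d_model, rounded up to a multiple."""
    raw = int(2 * (4 * cfg.d_model) / 3)
    m = cfg.ffn_multiple_of
    return m * ((raw + m - 1) // m)


def count_parameters(cfg: TinyGPTConfig) -> int:
    """Count weights, assuming bias-free linear layers and RMSNorm-style norms."""
    head_dim = cfg.d_model // cfg.n_heads
    kv_dim = head_dim * cfg.n_kv_heads
    # Q and output projections are d_model x d_model; K and V are d_model x kv_dim.
    attention = 2 * cfg.d_model * cfg.d_model + 2 * cfg.d_model * kv_dim
    # Gate, up, and down projections of the gated feed-forward block.
    feed_forward = 3 * cfg.d_model * ffn_hidden_dim(cfg)
    # One norm before attention and one before the feed-forward block.
    norms = 2 * cfg.d_model
    per_layer = attention + feed_forward + norms
    embedding = cfg.vocab_size * cfg.d_model
    output_head = 0 if cfg.tie_embeddings else embedding
    final_norm = cfg.d_model
    return cfg.n_layers * per_layer + embedding + output_head + final_norm


total = count_parameters(TinyGPTConfig())
print(f"{total:,} parameters ({total / 1e6:.1f}M)")
# prints: 25,829,888 parameters (25.8M)
```

Under these assumptions the transformer layers account for most of the weights, and the small vocabulary keeps the embedding table to roughly 3.3M parameters. Setting tie_embeddings=False or enlarging vocab_size shows how quickly the embedding alone can dominate a model of this size.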
Benefits:
- Cost-Effective: Users can experience the entire process of building a language model for around 3 RMB.
- Educational Resource: Ideal for those looking to learn about LLMs and their training processes.
- Community Driven: Encourages contributions and improvements from the community, fostering a collaborative environment for AI development.
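
The cost figure can be sanity-checked with simple arithmetic. The sketch below assumes the whole cost is GPU rental time, which the text does not state explicitly, and the helper names are hypothetical. It derives the hourly rate implied by the two quoted numbers (about 3 RMB, about 2 hours) and reuses that rate to estimate other run lengths.

```python
def implied_hourly_rate(total_cost_rmb: float, hours: float) -> float:
    """Hourly rental rate implied by a total cost and a run time."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_cost_rmb / hours


def estimate_cost(hourly_rate_rmb: float, hours: float, n_gpus: int = 1) -> float:
    """Rental cost of a run, assuming a flat per-GPU hourly rate."""
    if hours < 0 or n_gpus < 1:
        raise ValueError("hours must be non-negative and n_gpus at least 1")
    return hourly_rate_rmb * hours * n_gpus


# Figures quoted in the text: roughly 3 RMB for roughly 2 hours on one GPU.
rate = implied_hourly_rate(3.0, 2.0)
print(f"Implied rate: {rate} RMB per GPU-hour")  # prints: Implied rate: 1.5 RMB per GPU-hour

# Hypothetical longer run at the same assumed rate: 10 hours on one GPU.
print(f"10-hour run: {estimate_cost(rate, 10.0)} RMB")  # prints: 10-hour run: 15.0 RMB
```

Actual rental prices vary by provider and over time, so the 1.5 RMB per hour here is only what the two quoted figures imply, not a published rate.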
Highlights:
- Supports single and multi-GPU training.
- Compatible with popular frameworks like transformers and trl.
- Provides a simple API for integration with third-party applications.
Join the MiniMind community and start your journey in AI model training today!



