LogoAISecKit
icon of creek

creek

生成模型 tokenizer训练,模型初始化,模型预训练,指令微调。llama,creek

Introduction

Creek

Creek is a GitHub repository focused on training tokenizers for generative models, model initialization, pre-training, and instruction fine-tuning. It supports models like Llama and provides scripts for various stages of model training.

Key Features
  • Tokenizer Training: Efficiently train tokenizers for generative models.
  • Model Initialization: Initialize models with ease using provided scripts.
  • Pre-training: Pre-train models with customizable parameters to suit different hardware capabilities.
  • Instruction Fine-tuning: Fine-tune models based on specific instructions to enhance performance.
Benefits
  • Open Source: Contribute and collaborate with the community.
  • Comprehensive Documentation: Detailed instructions and scripts for each process.
  • Scalable: Adaptable to various hardware configurations, ensuring accessibility for different users.
Highlights
  • Supports advanced models like Llama.
  • Provides scripts for training, evaluation, and fine-tuning.
  • Active community contributions and updates.

Newsletter

Join the Community

Subscribe to our newsletter for the latest news and updates