A curated collection of open-source Chinese large language models, focusing on smaller, privately deployable, and cost-effective models.
Open-source Chinese LLaMA and Alpaca models for local CPU/GPU training and deployment.
DeepSeek-VL2 is a series of advanced Mixture-of-Experts Vision-Language Models for multimodal understanding.
Streamline the fine-tuning process for multimodal models like PaliGemma 2, Florence-2, and Qwen2.5-VL.
Step-Audio is an open-source framework for intelligent speech interaction, supporting multilingual and emotional speech synthesis.
MedRAX is a versatile AI agent for integrated chest X-ray analysis and medical reasoning.
R1-Onevision is a visual language model capable of deep chain-of-thought (CoT) reasoning.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs with zero-code CLI and Web UI.
EgoLife is an egocentric AI project for capturing and understanding multimodal daily activities.
A Unified Tokenizer for Visual Generation and Understanding.
Official repository for TheoremExplainAgent, which generates explanations of theorems using LLMs and AI video generation.
MAP-NEO is a fully open-source large language model with state-of-the-art performance for diverse research applications.