Train a 64M ChatBot from zero.
2 hours. ¥3. One 3090.
That's it.
💰 Ultra-Cheap
One 3090. Zero to ChatBot in 2 hours, costing just ¥3.
📦 Full Stack
Complete pipeline: Tokenizer → Pretrain → SFT → LoRA → DPO → PPO/GRPO/CISPO → Agentic RL
🔬 Latest RL
PPO, GRPO, CISPO + Agentic RL + YaRN length extrapolation. Native PyTorch.
📖 Learn by Reading
Pure PyTorch. No black boxes. Understand every line of code.
🔌 Plug & Play
Compatible with vLLM, ollama, llama.cpp, SGLang, transformers.
⚡ OpenAI API
Drop-in replacement for FastGPT, Open-WebUI, Dify. Tool Calling & Adaptive Thinking.
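The YaRN length extrapolation mentioned above scales RoPE frequencies non-uniformly by wavelength. Below is a minimal sketch in plain PyTorch, assuming textbook YaRN hyperparameters (beta_fast=32, beta_slow=1, 4x scaling over a 2048-token original context); MiniMind's actual configuration may differ:

```python
import math
import torch

def yarn_scaled_inv_freq(dim=64, base=10000.0, scale=4.0,
                         orig_ctx=2048, beta_fast=32, beta_slow=1):
    """YaRN-style non-uniform RoPE frequency scaling (illustrative sketch).

    High-frequency dimensions (short wavelengths) keep their original
    frequencies; low-frequency dimensions are interpolated by `scale`;
    a linear ramp blends the two regimes in between.
    """
    half = dim // 2
    inv_freq = 1.0 / (base ** (torch.arange(half).float() * 2 / dim))

    # Dimension index at which a position rotates `n_rot` full turns
    # within the original context window.
    def correction_dim(n_rot):
        return (dim * math.log(orig_ctx / (n_rot * 2 * math.pi))
                / (2 * math.log(base)))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), half - 1)
    # ramp: 0 below `low` (keep original freq), 1 above `high` (interpolate)
    ramp = torch.clamp((torch.arange(half).float() - low) / max(high - low, 1),
                       0.0, 1.0)
    return inv_freq * (1 - ramp) + (inv_freq / scale) * ramp
```

The full method also rescales attention logits by a temperature factor at long context; the sketch above covers only the frequency blend.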
| Model | Parameters | Hidden Dim | Layers | Memory |
|---|---|---|---|---|
| MiniMind-3 | 64M | 768 | 8 | ~0.5 GB |
| MiniMind-3-MoE | 198M / A64M | 768 | 8 | ~1.0 GB |
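As a sanity check on the table, the dense parameter count can be estimated from the listed shapes. The vocabulary and MLP intermediate sizes below are illustrative assumptions (not read from the repo), counting full multi-head attention, a SwiGLU MLP, and embeddings tied to the LM head:

```python
def estimate_dense_params(vocab=6400, hidden=768, layers=8, intermediate=2048):
    """Rough parameter estimate for the dense model in the table above.

    vocab/intermediate are assumptions; norms and biases are ignored.
    """
    embed = vocab * hidden           # token embedding (tied with lm_head)
    attn = 4 * hidden * hidden       # q, k, v, o projections
    mlp = 3 * hidden * intermediate  # gate, up, down projections (SwiGLU)
    return embed + layers * (attn + mlp)

print(f"{estimate_dense_params() / 1e6:.1f}M")  # lands in the ~60M range
```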
- 🔥 Release minimind-3 / minimind-3-moe: structure, tokenizer, training & inference fully updated
- Architecture aligned with Qwen3 / Qwen3-MoE: Dense ~64M, MoE ~198M/A64M
- New native Agentic RL script (train_agent.py): multi-turn Tool-Use with GRPO/CISPO
- RLAIF / Agentic RL rollout engine decoupled for flexible inference backends
- serve_openai_api.py & web_demo.py: reasoning_content / tool_calls / open_thinking
- Tokenizer updated (BPE + ByteLevel) with tool call & thinking tokens
- LoRA weight merge & export via scripts/convert_model.py
- README & architecture diagrams major update
- 🔥 RLAIF algorithms: PPO, GRPO, SPO (native PyTorch)
- Checkpoint resume training: auto-save & cross-GPU recovery
- RLAIF dataset: rlaif-mini.jsonl; simplified DPO dataset with added Chinese data
- YaRN algorithm for RoPE length extrapolation
- Adaptive Thinking in reasoning models
- Tool Calling & Reasoning tags support
- SwanLab integration (WandB alternative for China)
- Model parameter renaming (align with Transformers)
- Generate method refactor (GenerationMixin)
- ✅ llama.cpp, vLLM, ollama support
- Vocab update: <s></s> → <|im_start|><|im_end|>
- Code structure standardization
- Complete codebase rewrite
- MiniMind2 series: 26M, 104M, 145M models
- JSONL data format (no preprocessing needed)
- LoRA from scratch (no peft dependency)
- DPO native PyTorch implementation
- White-box model distillation
- DeepSeek-R1 distillation models
- High-quality pretraining data (pretrain in ~2 hours on a single 3090)
- Vision multimodal support
- MiniMind-V project launch
- Pretrain dataset preprocessing update
- Text integrity preservation
- Code cleanup
- MiniMind-V1-MoE model release
- Standardized minimind_tokenizer
- Removed mistral_tokenizer variants
- MiniMind-V1 (108M) release
- 3 epochs pretrain + 10 epochs SFT
- ModelScope online demo
- 🚀 First open-source release
- MiniMind project launched
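The GRPO used in the RLAIF and Agentic RL scripts above replaces a learned value network with group-relative advantages: each completion is scored against the other completions sampled for the same prompt. A minimal sketch of that normalization (function name and shapes are illustrative, not MiniMind's actual API):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages in the style of GRPO (illustrative sketch).

    `rewards` holds one scalar reward per rollout, shaped (groups, group_size):
    each row is a group of completions sampled for the same prompt. Each
    completion's advantage is its reward normalized by its own group's mean
    and std, so no critic model is needed.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: two prompts, four sampled completions each.
# A group with identical rewards (row 2) yields zero advantage everywhere.
r = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                  [2.0, 2.0, 2.0, 2.0]])
adv = grpo_advantages(r)
```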
🎯 Try It Online
📦 Get the Code
💡 Why MiniMind?
- Train from zero, not just finetune
- Pure PyTorch—no magic, no black boxes
- Understand by building, not by reading docs
- Works on your laptop—no cloud GPUs needed
- RLAIF algorithms: PPO, GRPO, CISPO + Agentic RL
- Tool Calling & Adaptive Thinking built-in
- OpenAI API compatible—plug into any UI
- Vision support via MiniMind-V
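Because the server speaks the OpenAI chat-completions protocol, any compatible client or UI can drive it. Below is a sketch of the request body such a client would POST to `/v1/chat/completions`, including a tool definition; the model name and tool schema here are illustrative assumptions, not part of the repo:

```python
import json

# Minimal OpenAI-style chat request with one tool, as an OpenAI-compatible
# UI (e.g. Open-WebUI) would send it. The `get_weather` tool is a made-up
# example used only to show the Tool Calling request shape.
request = {
    "model": "minimind",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather in Beijing?"},
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "stream": False,
}
body = json.dumps(request)  # POST this to the server's /v1/chat/completions
```

If the model decides to call the tool, the response carries the call in the assistant message's `tool_calls` field rather than in `content`.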
💭 "Building a Lego plane beats flying first class."