- The MiniMind series is extremely lightweight: the smallest version is roughly 1/7000 the
size of GPT-3, making it possible to train quickly even on an ordinary personal GPU.
- The project also open-sources the minimalist structure of the large model, including a shared-expert
Mixture-of-Experts (MoE) extension, dataset cleaning, pretraining, supervised fine-tuning (SFT), LoRA
fine-tuning, direct preference optimization (DPO), and model distillation, along with the full code for
every stage of the process.
- MiniMind has also been extended to a vision multimodal model (VLM): MiniMind-V.
- All core algorithm code is reimplemented from scratch in native PyTorch! It does not rely on abstract
interfaces provided by third-party libraries (a minimal sketch of what this looks like follows this list).
- This is not only a full-stage open-source reproduction of a large language model but also a tutorial for
LLM beginners.
- We hope this project serves as an inspiring example for everyone, helping people enjoy the fun of creation
and promoting progress across the wider AI community!
- To avoid misunderstanding, the "2 hours" figure was measured on a single NVIDIA 3090 GPU, and the
"3 RMB" refers to the GPU server rental cost. Detailed specifications can be found below.