HuggingFace Smol Training Playbook
Written as a "journey" through the SmolLM3 (3B) case: what was trained, which data mixtures were used, and what was done after pre-training.