Weight-Space Geometry of Offline Reasoning Training 🧭 • 23 • Interactive weight-space geometry of six reasoning losses
AlexWortega/ml-intern-v4-100m-tinystories-20260512-1721 Text Generation • 0.1B • Updated May 12 • 1.71k • 3
The Smol Training Playbook 📚 • 3.21k • The secrets to building world-class LLMs