Small Reasoning Models
Small reasoning models aim to achieve strong reasoning, coding, or math performance with relatively compact parameter counts.
Sources in this batch
- VibeThinker-3B is a technical report on a 3B dense model for verifiable reasoning, using curriculum SFT, multi-domain RL, and offline self-distillation. The abstract reports strong AIME26 and LiveCodeBench results.
- Maxime Labonne’s talk, Everything I Learned Training Frontier Small Models, is a relevant video source for practical small-model training lessons.
Research interest
The surprising claim is that a 3B model can approach the performance of top-tier reasoning systems on some verifiable tasks when trained with curriculum SFT, RL, and self-distillation, and evaluated with test-time scaling. The research question is whether this reflects a general recipe for compressing reasoning ability or an effect specific to particular benchmarks or tasks. This page should track replications, ablations, and evidence of out-of-distribution robustness.
Why it matters
Small reasoning models are important for local inference, lower serving cost, privacy-sensitive deployments, and fast iteration. They connect model training choices to local-ai-benchmarks and production constraints.
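To make the local-inference point concrete, here is a rough back-of-the-envelope sketch of weight memory for a 3B-parameter dense model at a few common precisions. It assumes only weights are counted (no KV cache, activations, or runtime overhead) and uses 1 GB = 1e9 bytes. The function names and the precision table are illustrative, not taken from any of the sources above.

```python
# Rough weight-memory estimate for a dense model at several precisions.
# Assumption: only weights are counted; KV cache, activations, and runtime
# overhead are ignored, so real memory usage will be higher.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params, bytes_per_param):
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def footprint_table(num_params):
    """Map each precision name to its estimated weight memory in GB."""
    return {
        name: weight_memory_gb(num_params, size)
        for name, size in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    # 3e9 parameters, matching the 3B model size discussed on this page.
    for name, gb in footprint_table(3e9).items():
        print(f"{name:>10}: {gb:.1f} GB")
```

Under these assumptions, the weights need roughly 6 GB at 16-bit precision and roughly 1.5 GB at 4-bit. That is why models of this size are plausible candidates for consumer GPUs and laptops, though actual headroom depends on context length and the serving runtime.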