A new tiny 7-million-parameter model called TRM achieves superior reasoning performance on hard benchmarks like ARC-AGI-1 and ARC-AGI-2, outperforming much larger frontier models by simplifying recursive reasoning.
Takeaways
• A 7-million-parameter model (TRM) beats larger frontier models on hard reasoning tasks.
• TRM uses a simplified recursive reasoning approach with iterative self-critique, not complex biological arguments.
• This 'recursion as a scaling law' could make advanced AI accessible on personal devices.
A groundbreaking 7-million-parameter model, TRM (Tiny Recursive Model), developed at Samsung, has surpassed much larger models such as Gemini 2.5 Pro and DeepSeek R1 on complex reasoning benchmarks. The result is attributed to a simplified recursive reasoning approach that lets the model refine its answers through iterative self-critique, a method far more parameter-efficient than traditional large language model scaling. TRM's success suggests a potential new path toward Artificial General Intelligence with significantly smaller, more accessible models.
Limitations of LLMs in Reasoning
• 00:01:03 Large language models frequently struggle with hard reasoning problems because they generate answers autoregressively, predicting one token at a time, so early errors compound. Techniques like chain-of-thought prompting improve reliability by having the model reason through a solution before answering, but they are expensive and can be brittle; pass@K sampling instead generates multiple responses and picks the best one (see the sketch below). These methods often mask the core issue: LLMs aren't truly reasoning, merely predicting.
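To make the pass@K idea concrete, here is a minimal, hypothetical sketch in Python: `generate_candidate` and `score_candidate` are illustrative stand-ins for a model's sampling call and whatever verifier or heuristic is available, not any specific API.

```python
# Hedged sketch of pass@K-style selection, not the video's exact setup.
import random
from typing import Callable

def best_of_k(
    generate_candidate: Callable[[str], str],      # samples one answer for a prompt
    score_candidate: Callable[[str, str], float],  # rates a (prompt, answer) pair
    prompt: str,
    k: int = 8,
) -> str:
    """Sample k independent answers and return the highest-scoring one."""
    candidates = [generate_candidate(prompt) for _ in range(k)]
    return max(candidates, key=lambda ans: score_candidate(prompt, ans))

# Toy usage: a "model" that guesses digits, scored by closeness to 7.
if __name__ == "__main__":
    guess = lambda _prompt: str(random.randint(0, 9))
    closeness = lambda _prompt, ans: -abs(int(ans) - 7)
    print(best_of_k(guess, closeness, "pick a digit near 7"))
```

Note that pass@K only helps if the scorer can actually tell good answers from bad ones, which is part of why it masks rather than fixes the underlying reasoning gap.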
Hierarchical Reasoning Models (HRM)
• 00:03:31 A newly proposed path forward is the Hierarchical Reasoning Model (HRM), which achieves high accuracy on puzzle tasks like Sudoku and ARC-AGI where LLMs falter. HRM introduces two novelties: recursive hierarchical reasoning, which loops multiple times through two small networks (sketched below), and deep supervision. The original HRM authors justified the approach with biological arguments, suggesting it mimics how the brain computes, but those explanations were later questioned for their applicability and for not clarifying why specific components worked.
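The following is a schematic PyTorch sketch of that two-loop recursion under assumed shapes and loop counts; it is not HRM's actual architecture, and deep supervision (supervising intermediate outputs during training) is omitted.

```python
# Schematic sketch of HRM-style recursive hierarchical reasoning.
# The two MLPs, state sizes, and loop counts are illustrative assumptions.
import torch
import torch.nn as nn

class TwoLevelRecursion(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        # Low-level network sees the input plus both states; high-level sees only the states.
        self.low = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.high = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, z_low, z_high, n_low: int = 4, n_high: int = 2):
        # High-level state updates slowly; low-level state loops quickly within each cycle.
        for _ in range(n_high):
            for _ in range(n_low):
                z_low = self.low(torch.cat([x, z_low, z_high], dim=-1))
            z_high = self.high(torch.cat([z_low, z_high], dim=-1))
        return z_low, z_high
```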
Tiny Recursive Model (TRM) Approach
• 00:08:40 TRM simplifies HRM's recursive approach: it needs no complex mathematical theorems, hierarchies, or biological arguments, and it generalizes better with a single tiny network instead of two medium-sized ones. It keeps the most important ingredient, a feedback loop in which the model maintains two memories (its current guess and a reasoning trace) and updates both on every recursion, as sketched below. This iterative self-critique mechanism lets it continuously improve its answer, much like trying a move in Sudoku, reflecting, and adjusting.
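Below is a hedged PyTorch sketch of that feedback loop under illustrative assumptions (layer sizes, loop counts, and how inputs are concatenated are all guesses, not the paper's exact design): a single tiny network alternately refines a reasoning trace z and the current answer y.

```python
# Hedged sketch of a TRM-style recursion loop: one tiny shared network
# maintains a current answer y and a reasoning trace z, refining both
# each self-critique cycle. All sizes and loop counts are assumptions.
import torch
import torch.nn as nn

class TinyRecursiveModel(nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        # One small network is reused for both the trace and answer updates.
        self.net = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def step(self, x, y, z, n_inner: int = 6):
        # Refine the reasoning trace several times against input + current guess...
        for _ in range(n_inner):
            z = self.net(torch.cat([x, y, z], dim=-1))
        # ...then revise the answer from the guess and updated trace
        # (x zeroed here only to keep the shared network's input width constant).
        y = self.net(torch.cat([torch.zeros_like(x), y, z], dim=-1))
        return y, z

    def forward(self, x, n_outer: int = 3):
        y = torch.zeros_like(x)  # blank initial guess
        z = torch.zeros_like(x)  # blank initial reasoning trace
        for _ in range(n_outer):  # each outer step is one "self-critique" cycle
            y, z = self.step(x, y, z)
        return y
```

Sharing one network for both updates is what keeps the parameter count tiny; depth comes from iteration rather than from stacking layers.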
TRM Performance and Implications
• 00:11:32 TRM significantly outperforms HRM, achieving larger gains with less recursion depth. It scored 44.6% on ARC-AGI-1 and 7.8% on ARC-AGI-2, surpassing leading models such as DeepSeek R1, Claude 3.7, and Gemini 2.5 Pro despite being a tiny fraction of their size (7 million parameters versus models in the trillion-parameter class). This suggests recursion may be a new scaling law, providing 'virtual depth' that enables powerful reasoning in small models and could make advanced AI accessible on everyday devices; a back-of-the-envelope calculation of that virtual depth follows.
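As a rough, purely illustrative calculation of 'virtual depth' (all hyperparameters below are assumed for illustration, not TRM's reported configuration), the effective number of sequential layer applications grows multiplicatively with the recursion loops:

```python
# Back-of-the-envelope "virtual depth" from recursion: a tiny network applied
# repeatedly behaves like a much deeper stack. All numbers here are assumed.
physical_layers = 2     # layers in the tiny network
inner_loops = 6         # reasoning-trace updates per self-critique cycle (+1 answer update)
outer_loops = 3         # self-critique cycles per supervision step
supervision_steps = 16  # deep-supervision steps at inference

virtual_depth = physical_layers * (inner_loops + 1) * outer_loops * supervision_steps
print(virtual_depth)  # 2 * 7 * 3 * 16 = 672 effective layer applications
```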