Latent Reasoning
Deep Supervision with Recursion
Train a recursive model without the large memory and compute cost of full backpropagation through time (BPTT).
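A minimal sketch of the idea, assuming a toy scalar recurrence and a hand-derived gradient (neither comes from the note). The loss is applied after every segment, the state entering the last step of a segment is treated as a constant, and the gradient flows through that last step only, so nothing is unrolled across segments. The names `step`, `rollout`, and `train_deep_supervision` are hypothetical.

```python
import math


def step(w, z, x):
    """One recursion step of a toy scalar model: z_next = tanh(w * z + x)."""
    return math.tanh(w * z + x)


def rollout(w, x, n_steps, z=0.0):
    """Apply the recursion n_steps times with no gradient bookkeeping."""
    for _ in range(n_steps):
        z = step(w, z, x)
    return z


def train_deep_supervision(x, target, w=0.1, segments=4, inner_steps=3,
                           lr=0.1, epochs=200):
    """Supervise every segment; differentiate through one step per segment."""
    for _ in range(epochs):
        z = 0.0  # latent state carried from one segment to the next
        for _ in range(segments):
            # All but the last inner step run without gradient tracking.
            z_prev = rollout(w, x, inner_steps - 1, z)
            # z_prev is treated as a constant here (the "detach" point).
            z = step(w, z_prev, x)
            # Squared-error loss (z - target)^2, differentiated w.r.t. w
            # through the last step only: d tanh(u)/dw = (1 - z^2) * z_prev.
            grad_w = 2.0 * (z - target) * (1.0 - z * z) * z_prev
            w -= lr * grad_w
    return w
```

In this sketch the work per update stays constant no matter how many segments are chained, which is the saving over full BPTT; the price is an approximate gradient that ignores how earlier steps depend on `w`.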
Tiny Recursive Model (TRM)
Small transformers lack multi-step reasoning ability. Use latent recurrence in hidden states instead of explicit chain-of-thought tokens.
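One way to picture latent recurrence, as a sketch under stated assumptions: the model keeps a current answer `y` and a latent state `z`, refines `z` several times from the input, answer, and previous latent state, then refines `y` from `z`, and never emits intermediate tokens. The two update functions below are hypothetical fixed-weight stand-ins for a small network, not the model's real layers.

```python
import math


def update_latent(x, y, z):
    """Hypothetical latent update mixing input, current answer, and latent."""
    return [math.tanh(0.5 * xi + 0.3 * yi + 0.2 * zi)
            for xi, yi, zi in zip(x, y, z)]


def update_answer(y, z):
    """Hypothetical answer update that refines the answer from the latent."""
    return [math.tanh(0.5 * yi + 0.5 * zi) for yi, zi in zip(y, z)]


def latent_reason(x, n_latent_steps=6, n_cycles=3):
    """Reason by iterating hidden state instead of generating tokens."""
    y = [0.0] * len(x)  # current answer estimate
    z = [0.0] * len(x)  # latent "scratchpad" state
    for _ in range(n_cycles):
        for _ in range(n_latent_steps):
            z = update_latent(x, y, z)  # think: refine the latent state
        y = update_answer(y, z)         # act: refine the answer
    return y, z
```

Calling `latent_reason([1.0, -1.0])` runs 18 latent updates and 3 answer updates with the same two functions, so effective depth grows with the loop counts rather than with parameter count.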
Hierarchical Reasoning Model (HRM)
Flat latent reasoning has limited depth. Add hierarchical structure to latent recurrence for deeper reasoning in transformers.
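A rough sketch of the hierarchical variant, assuming two coupled states on different timescales: a fast low-level state updated every step and a slow high-level state updated once per block of low-level steps. The scalar update rules and the names `low_step`, `high_step`, and `hierarchical_rollout` are illustrative assumptions, not the model's real modules.

```python
import math


def low_step(z_low, z_high, x):
    """Fast module: updated every step, conditioned on the slow state."""
    return math.tanh(0.6 * z_low + 0.3 * z_high + x)


def high_step(z_high, z_low):
    """Slow module: updated once per cycle from the fast module's result."""
    return math.tanh(0.7 * z_high + 0.5 * z_low)


def hierarchical_rollout(x, high_cycles=2, low_steps_per_cycle=4):
    """Nest a fast inner loop inside a slow outer loop."""
    z_low, z_high = 0.0, 0.0
    trace = []  # records the order of updates: "L" = low, "H" = high
    for _ in range(high_cycles):
        for _ in range(low_steps_per_cycle):
            z_low = low_step(z_low, z_high, x)
            trace.append("L")
        z_high = high_step(z_high, z_low)
        trace.append("H")
    return z_high, trace
```

With the defaults, the update order is four low-level steps, one high-level step, then the same again, so the slow state changes only after the fast state has had several steps to settle.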