interesting papers - an engineerA314 Collection

interesting papers

updated Mar 20

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published Feb 7 • 140

Agency Is Frame-Dependent
Paper • 2502.04403 • Published Feb 6 • 23

Distillation Scaling Laws
Paper • 2502.08606 • Published Feb 12 • 48

LLM Pretraining with Continuous Concepts
Paper • 2502.08524 • Published Feb 12 • 28

From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper • 2502.18890 • Published Feb 26 • 30

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Paper • 2503.01307 • Published Mar 3 • 38

RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper • 2503.14456 • Published Mar 18 • 147

Transformers without Normalization
Paper • 2503.10622 • Published Mar 13 • 162

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Paper • 2503.07572 • Published Mar 10 • 44

Implicit Reasoning in Transformers is Reasoning through Shortcuts
Paper • 2503.07604 • Published Mar 10 • 22