AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published Apr 1 • 68
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31 • 62
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28 • 40
closedagi/gpt5-r3-claude4-magnum-dong-slerp-raceplay-dpo-4.5bpw-1.1T-exl2-rpcal Updated Feb 2 • 5 • 20
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published Jan 30 • 61
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 121
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 289
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 276
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though Paper • 2501.04682 • Published Jan 8 • 97