Reinforcement Learning - a cp138 Collection

cp138's Collections

Reinforcement Learning

updated Dec 17, 2024

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 9

558

Scaling test-time compute

Enhance math problem solving by scaling test-time compute