AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset Paper • 2504.16891 • Published 15 days ago • 19
OpenMathReasoning Collection Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset" • 7 items • Updated 3 days ago • 36
Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective Paper • 2311.14948 • Published Nov 25, 2023
Countering Language Drift with Seeded Iterated Learning Paper • 2003.12694 • Published Mar 28, 2020 • 1
Recall Traces: Backtracking Models for Efficient Reinforcement Learning Paper • 1804.00379 • Published Apr 2, 2018
Supervised Seeded Iterated Learning for Interactive Language Learning Paper • 2010.02975 • Published Oct 6, 2020
Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment Paper • 2502.00203 • Published Jan 31 • 2
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4 • 13