ProX Refining Models • Collection • Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 3
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients • Paper • 2504.10766 • Published 24 days ago • 40
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations • Paper • 2504.10481 • Published 24 days ago • 84
Understanding R1-Zero-Like Training: A Critical Perspective • Paper • 2503.20783 • Published Mar 26 • 48
Modifying Large Language Model Post-Training for Diverse Creative Writing • Paper • 2503.17126 • Published Mar 21 • 36
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation • Paper • 2502.10341 • Published Feb 14 • 2
JudgeBench: A Benchmark for Evaluating LLM-based Judges • Paper • 2410.12784 • Published Oct 16, 2024 • 48