🌾Oat-Zero: Understanding R1-Zero-Like Training - a sail Collection

sail's Collections

🚀 Active PRM

🌾Oat-Zero: Understanding R1-Zero-Like Training

🔱 Sailor2 Language Models

🧬 RegMix: Data Mixture as Regression

📈 Scaling Laws with Vocabulary

⚓️ Sailor Language Models

updated 29 days ago