WebThinker: Empowering Large Reasoning Models with Deep Research Capability Paper • 2504.21776 • Published 8 days ago • 41
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published 9 days ago • 89
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 21 days ago • 88
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25, 2024 • 114
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 124
RLVR Collection Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated Mar 31 • 11
ReSearch Collection Trained models as described in the paper "ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning" • 5 items • Updated Mar 27 • 5
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 236
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10 • 98
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 406