Collection for the paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
HKUST NLP Group
university
Collections: 7
Models: 45
hkust-nlp/Llama-3.1-8B-SimpleRL-Zoo • 51
hkust-nlp/Qwen-2.5-32B-SimpleRL-Zoo • 437
hkust-nlp/Qwen-2.5-7B-SimpleRL-Zoo • 1.17k
hkust-nlp/DeepSeek-Math-7B-SimpleRL-Zoo • 72
hkust-nlp/Mistral-7B-v0.1-SimpleRL-Zoo • 117
hkust-nlp/Qwen-2.5-1.5B-SimpleRL-Zoo • 1.24k
hkust-nlp/Qwen-2.5-0.5B-SimpleRL-Zoo • 64
hkust-nlp/Qwen-2.5-14B-SimpleRL-Zoo • 350
hkust-nlp/Mistral-Small-24B-SimpleRL-Zoo • 41
hkust-nlp/Qwen-2.5-Math-7B-SimpleRL-Zoo • 3.37k
Datasets: 23
hkust-nlp/GUIMid • Viewer • 1.7M • 177 • 2
hkust-nlp/SimpleRL-Zoo-Data • Viewer • 53.1k • 1.42k • 4
hkust-nlp/PreSelect-100B • Viewer • 54.5M • 731 • 9
hkust-nlp/CodeIO-PyEdu-Reasoning • Preview • 142 • 49
hkust-nlp/CodeIO-PyEdu-Reasoning-Raw • 41
hkust-nlp/SynCSE-partial-NLI • Viewer • 263k • 29 • 2
hkust-nlp/SynCSE-scratch-NLI • Viewer • 276k • 14 • 2
hkust-nlp/gsm8k-fix • Viewer • 7.47k • 83 • 2
hkust-nlp/dart-math-uniform • Viewer • 591k • 59 • 9
hkust-nlp/vrt-baseline • Viewer • 591k • 35 • 1