citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.75_0.25_True_1600 • Text Generation • Updated about 1 hour ago
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_balanced_0.5_0.5_True_1600 • Text Generation • Updated about 14 hours ago
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.25_0.75_True_1600 • Updated about 21 hours ago
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_cosine_0.5_0.5_True_1600 • Text Generation • Updated 1 day ago
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_classic_0.5_0.5_True_1600 • Text Generation • Updated 2 days ago
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_True_1600 • Text Generation • Updated 3 days ago • 2
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0.75_0.25_True_1600 • Text Generation • Updated 7 days ago • 1
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_gaussian_0_25_0_75_True_1600 • Text Generation • Updated 8 days ago • 1