Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29, 2024 • 77
LocAgent: Graph-Guided LLM Agents for Code Localization Paper • 2503.09089 • Published Mar 12 • 10
bigcode/self-oss-instruct-sc2-exec-filter-50k Viewer • Updated Nov 4, 2024 • 50.7k • 329 • 99
UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function Paper • 2410.21438 • Published Oct 28, 2024 • 2
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published Mar 13 • 28
Running 2.56k The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 229
Running 63 Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks • Evaluate multilingual models using FineTasks
Code Evaluation Collection Collection of Papers on Code Evaluation (from code generation language models) • 45 items • Updated Oct 29, 2024 • 15