SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 21 days ago • 176
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 229
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 61
SelfCodeAlign: Self-Alignment for Code Generation Paper • 2410.24198 • Published Oct 31, 2024 • 25
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 96
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 144
Power Hungry Processing: Watts Driving the Cost of AI Deployment? Paper • 2311.16863 • Published Nov 28, 2023 • 6
What's in the Box? A Preliminary Analysis of Undesirable Content in the Common Crawl Corpus Paper • 2105.02732 • Published May 6, 2021
OctoPack: Instruction Tuning Code Large Language Models Paper • 2308.07124 • Published Aug 14, 2023 • 29
Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness Paper • 2302.10893 • Published Feb 7, 2023 • 6
Evaluating the Social Impact of Generative AI Systems in Systems and Society Paper • 2306.05949 • Published Jun 9, 2023 • 9
Quantifying the Carbon Emissions of Machine Learning Paper • 1910.09700 • Published Oct 21, 2019 • 16
ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods Paper • 2110.02871 • Published Oct 6, 2021