SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (Paper • 2502.02737 • Published Feb 4)
Hey, I'm helping out on some community research to learn about the AI community. If you want to join in the conversation, head over here, where I started a community discussion on the most influential model since BERT: OSAIResearchCommunity/README#2
I have just released a new blog post about KV caching and its role in inference speedup: https://huggingface.co/blog/not-lain/kv-caching/ Some takeaways: