ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations Paper • 2505.02819 • Published 3 days ago • 23
March 2025 - Open releases from the Chinese community Collection 30 items • Updated Apr 2 • 12
How far can we go with ImageNet for Text-to-Image generation? Paper • 2502.21318 • Published Feb 28 • 26
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12 • 58
FoNE: Precise Single-Token Number Embeddings via Fourier Features Paper • 2502.09741 • Published Feb 13 • 13
Aira Collection Aira is a series of chatbots developed as an experimentation playground for value alignment. • 27 items • Updated Jun 20, 2024 • 1
Loxa Collection Loxa family models are the best models to run on CPU and GPU with high quality (>=92% accuracy) • 5 items • Updated Feb 3 • 2
Quadrifoglio Collection Small text2text models finetuned on Italian machine translation tasks. • 6 items • Updated Jan 12 • 1
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 149
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
FluidML: Fast and Memory Efficient Inference Optimization Paper • 2411.09242 • Published Nov 14, 2024 • 1
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 63