The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper • 2502.10391 • Published Feb 14, 2025
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper • 2502.11089 • Published Feb 16, 2025
Fast Inference from Transformers via Speculative Decoding • Paper • 2211.17192 • Published Nov 30, 2022