XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning Paper • 2005.00333 • Published May 1, 2020
Distilling Efficient Language-Specific Models for Cross-Lingual Transfer Paper • 2306.01709 • Published Jun 2, 2023 • 1
Composable Sparse Fine-Tuning for Cross-Lingual Transfer Paper • 2110.07560 • Published Oct 14, 2021 • 1
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue Paper • 2204.10757 • Published Apr 22, 2022 • 1
Fine-tuning Large Language Models with Sequential Instructions Paper • 2403.07794 • Published Mar 12, 2024
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Paper • 2403.09636 • Published Mar 14, 2024 • 3
Multi-Head Adapter Routing for Cross-Task Generalization Paper • 2211.03831 • Published Nov 7, 2022 • 2
Cross-Tokenizer Distillation via Approximate Likelihood Matching Paper • 2503.20083 • Published Mar 25, 2025 • 1
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs Paper • 2504.17768 • Published Apr 2025 • 12