arxiv:2503.19206

Overtrained Language Models Are Harder to Fine-Tune

Published on Mar 24

Authors:

Abstract

Large language models are pre-trained on ever-growing token budgets under the assumption that better pre-training performance translates to improved downstream models. In this work, we challenge this assumption and show that extended pre-training can make models harder to fine-tune, leading to degraded final performance. We term this phenomenon catastrophic overtraining. For example, the instruction-tuned OLMo-1B model pre-trained on 3T tokens leads to over 2% worse performance on multiple standard LLM benchmarks than its 2.3T token counterpart. Through controlled experiments and theoretical analysis, we show that catastrophic overtraining arises from a systematic increase in the broad sensitivity of pre-trained parameters to modifications, including but not limited to fine-tuning. Our findings call for a critical reassessment of pre-training design that considers the downstream adaptability of the model.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

No model linking this paper

Cite arxiv.org/abs/2503.19206 in a model README.md to link it from this page.

No dataset linking this paper

Cite arxiv.org/abs/2503.19206 in a dataset README.md to link it from this page.

No Space linking this paper

Cite arxiv.org/abs/2503.19206 in a Space README.md to link it from this page.