Fill-Mask
Transformers
PyTorch
roberta

A standard roberta-large model, fine-tuned for one pass over the entire Pile dataset.

See "Test-time training on nearest neighbors for large language models" for details.
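Since the card is tagged Fill-Mask, the checkpoint can presumably be queried like any other RoBERTa masked language model. The following is a minimal sketch, not an official usage snippet: it assumes the `transformers` and `torch` packages are installed and that the repository ships tokenizer files alongside the weights (if it does not, passing `tokenizer="roberta-large"` to `pipeline` is a plausible fallback). The helper names `mask_word` and `top_predictions` are hypothetical and exist only for illustration.

```python
# Repository id as it appears on this model card.
MODEL_ID = "socialfoundations/roberta-large-pile-lr2e-5-bs16-8gpu-1700000"

# RoBERTa tokenizers use "<mask>" as the mask token.
ROBERTA_MASK = "<mask>"


def mask_word(sentence: str, word: str, mask_token: str = ROBERTA_MASK) -> str:
    """Replace the first occurrence of `word` with the mask token (illustrative helper)."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def top_predictions(masked_sentence: str, top_k: int = 5):
    """Return (token, score) pairs for a sentence containing exactly one mask token.

    The import is done lazily so the helpers above stay usable without
    transformers installed. The first call downloads the model weights.
    """
    from transformers import pipeline

    # Assumption: the repo includes tokenizer files; otherwise add
    # tokenizer="roberta-large" to the call below.
    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    results = fill_mask(masked_sentence, top_k=top_k)
    return [(r["token_str"].strip(), r["score"]) for r in results]
```

For example, `top_predictions(mask_word("The capital of France is Paris.", "Paris"))` would return the five highest-scoring candidate tokens for the masked position together with their probabilities. The actual predictions depend on the checkpoint and are not shown here.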

Dataset used to train socialfoundations/roberta-large-pile-lr2e-5-bs16-8gpu-1700000