Small Languages, Big Models: A Study of Continual Training on Languages of Norway Paper • 2412.06484 • Published Dec 9, 2024
Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles Paper • 2501.07718 • Published Jan 13
Beemo: Benchmark of Expert-edited Machine-generated Outputs Paper • 2411.04032 • Published Nov 6, 2024
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies Paper • 2503.10267 • Published Mar 13
A Family of Pretrained Transformer Language Models for Russian Paper • 2309.10931 • Published Sep 19, 2023 • 5
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark Paper • 2010.15925 • Published Oct 29, 2020
Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models Paper • 2202.07791 • Published Feb 15, 2022
Findings of the The RuATD Shared Task 2022 on Artificial Text Detection in Russian Paper • 2206.01583 • Published Jun 3, 2022 • 1
Vote'n'Rank: Revision of Benchmarking with Social Choice Theory Paper • 2210.05769 • Published Oct 11, 2022
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 192
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14 • 63
The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective Paper • 2412.09460 • Published Dec 12, 2024 • 8
Post: Just updated the cozy HF Daily Papers review page. Affiliations extraction (filters are coming soon). Redesign. Top by month page. Syncing every 1 hour. https://hfday.ru Your feedback is appreciated.
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56