RefalMachine committed
Commit 25033eb · verified · 1 Parent(s): c79fbcd

Update README.md

Files changed (1): README.md (+2 −3)
README.md CHANGED
@@ -16,9 +16,8 @@ In addition to developing the methodology itself, we also employ it to adapt exi
 One of the unique features of our approach to adaptation is that, thanks to the LEP method (Learned Embedding Propagation, see the paper below), we adapt the base version of a model just once and can then very affordably adapt any instruction-tuned version derived from that base. For instance, after adapting Qwen2.5-32B, we obtained RuadaptQwen2.5 versions not only for Qwen2.5-32B-Instruct but also for QwQ-32B-Preview, QwQ-32B, FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview (while preserving reasoning capabilities), and T-pro-it-1.0.
 
 An intriguing aspect of adapting T-pro-it-1.0 is that this model was obtained through continued pretraining on over 100 billion tokens of Russian-language data with full fine-tuning. Despite this extensive prior training, our methodology still worked effectively (note: it was the original base model, Qwen2.5-32B, that was adapted!), and the resulting adapted version matched or outperformed T-pro-it-1.0 on several benchmarks. Moreover, it demonstrated higher efficiency in Russian-language tokenization.
-<div style="text-align: center">
-<img src="https://cdn-uploads.huggingface.co/production/uploads/652cedbdf120598322ae358a/sKwHvA9ztd7rHx37Ca2ey.png" style="max-width: 50%; height: auto;">
-</div>
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/652cedbdf120598322ae358a/sKwHvA9ztd7rHx37Ca2ey.png" style="display: block; margin: 0 auto; max-width: 50%; height: auto;">
 
 ## Papers
 Tikhomirov M., Chernyshov D. Facilitating Large Language Model Russian Adaptation with Learned Embedding Propagation // Journal of Language and Education. 2024. Vol. 10, No. 4. pp. 130–145. (Preprint: https://arxiv.org/abs/2412.21140)
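
To make the "adapt the base once, propagate to derivatives" workflow from the README text above concrete, here is a minimal sketch using the Hugging Face transformers API. The model paths are placeholders, and the copy shown is a deliberate simplification: the actual LEP procedure in the paper (arXiv:2412.21140) calibrates the propagated embeddings rather than transplanting them verbatim.

```python
# Minimal sketch of embedding propagation from an adapted base model into an
# instruct derivative. Paths are hypothetical; the paper's LEP method refines
# this naive embedding swap with a learned propagation step.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Base model adapted once with the extended Russian tokenizer (placeholder path).
adapted_base = AutoModelForCausalLM.from_pretrained("path/to/adapted-base")
new_tokenizer = AutoTokenizer.from_pretrained("path/to/adapted-base")

# Any instruct/reasoning derivative of the same original base.
instruct = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

with torch.no_grad():
    # Resize the derivative's vocabulary to match the adapted tokenizer...
    instruct.resize_token_embeddings(len(new_tokenizer))
    # ...then transplant the embeddings learned during base adaptation,
    # leaving the derivative's transformer blocks untouched.
    instruct.get_input_embeddings().weight.copy_(
        adapted_base.get_input_embeddings().weight)
    instruct.get_output_embeddings().weight.copy_(
        adapted_base.get_output_embeddings().weight)

instruct.save_pretrained("path/to/adapted-instruct")
new_tokenizer.save_pretrained("path/to/adapted-instruct")
```

Because only the embedding matrices are touched, the cost of adapting each additional derivative is negligible compared to re-running continued pretraining, which is why a single base adaptation could cover Qwen2.5-32B-Instruct, the QwQ variants, and T-pro-it-1.0.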