Update README.md
README.md CHANGED
@@ -16,6 +16,8 @@ widget:
 
 TookaBERT models are a family of encoder models trained on Persian in two sizes, base and large. These models were pre-trained on over 500GB of Persian data covering a variety of topics such as news, blogs, forums, and books, using the MLM (WWM) objective with two context lengths.
 
+For more information, you can read our paper on [arXiv](https://arxiv.org/abs/2407.16382).
+
 ## How to use
 
 You can use this model directly for Masked Language Modeling using the provided code below.
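
The code snippet referenced by the last context line is not part of this hunk. For orientation, a minimal sketch of such a masked-language-modeling call with Hugging Face transformers follows; the model ID `PartAI/TookaBERT-Base` and the example sentence are assumptions for illustration, not taken from this diff:

```python
# Minimal sketch, assuming the checkpoint is published as "PartAI/TookaBERT-Base";
# substitute the actual model ID from this repository.
from transformers import pipeline

# Build a fill-mask pipeline, which runs the encoder and ranks candidate
# tokens for the [MASK] position.
fill_mask = pipeline("fill-mask", model="PartAI/TookaBERT-Base")

# Persian example sentence with one masked token
# ("The capital of Iran is the city of [MASK].").
results = fill_mask("پایتخت ایران شهر [MASK] است.")

# Each result carries the predicted token string and its score.
for r in results:
    print(r["token_str"], r["score"])
```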