OpenLLM-Ro / README
mihaimasala committed on Aug 23, 2024

Commit e25ad1d · verified · 1 parent: 5fde02d

Add data collections info

Files changed (1)

README.md

@@ -13,12 +13,15 @@ We value:
 - using public and open corpora
 - open-source training and evaluation code.
-In this organization, you can find RoLLM models, based on different underlying models and in different flavours (i.e., foundational, instruct, or chat variants). There are currently two collections:
+In this organization, you can find RoLLM models, based on different underlying models and in different flavours (i.e., foundational, instruct, or chat variants). There are currently four model collections:
 - RoLlama2: Romanian models based on Llama2
 - RoMistral: Romanian models based on Mistral
 - RoGemma: Romanian models based on Gemma
 - RoLlama3: Romanian models based on Llama3
+Furthermore, here you can find data that was used for training and evaluation LLMs in Romanian. Currently, there are two data collections:
+- SFT datasets: data used for supervised (instruction) finetuning
+- Evaluation datasets: data used for evaluating LLM in Romanian
 See details in [https://arxiv.org/abs/2406.18266](https://arxiv.org/abs/2406.18266) and [https://arxiv.org/abs/2405.07703](https://arxiv.org/abs/2405.07703).