Update README.md
README.md
CHANGED
@@ -118,7 +118,7 @@ The model, NT-Java-1.1B, has been trained on publicly available datasets and com
 ## Model
 
 - **Architecture:** GPT-2 model with Multi-Query Attention and Fill-in-the-Middle objective
--
+- **Pretraining steps:** 50k
 - **Pretraining tokens:** 22 Billion
 - **Precision:** bfloat16
 
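
As a rough sanity check on the newly added field, the two pretraining figures in this hunk can be combined into an average tokens-per-step number. This is only a sketch: it assumes the 22 Billion tokens were consumed over exactly the 50k steps listed, and both values are rounded in the model card, so the result is approximate. The function name `tokens_per_step` is hypothetical and not part of the repository.

```python
# Values copied from the model card fields shown in the diff (both are rounded).
PRETRAINING_STEPS = 50_000            # "Pretraining steps: 50k"
PRETRAINING_TOKENS = 22_000_000_000   # "Pretraining tokens: 22 Billion"


def tokens_per_step(total_tokens: int, steps: int) -> float:
    """Average number of tokens processed per training step.

    Assumption: total_tokens is the full token budget spent across `steps`.
    """
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_tokens / steps


avg = tokens_per_step(PRETRAINING_TOKENS, PRETRAINING_STEPS)
print(f"{avg:,.0f} tokens per step")  # 440,000 tokens per step
```

Under these assumptions each step covers roughly 440k tokens. The model card does not state the batch size or sequence length, so this figure cannot be decomposed further from the diff alone.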