Commit 84e67a0 (verified), committed by speed · 1 parent: 3a6d756

Update README.md

Files changed (1):
  1. README.md +0 -4
README.md CHANGED
@@ -70,10 +70,6 @@ Training code can be found at https://github.com/llm-jp/llm-jp-modernbert
 
 A blank in stage 2 indicates the same value as in stage 1.
 
-In theory, stage 1 consumes 1.7T tokens, but sentences shorter than 1024 tokens do not fill the full sequence length, so the actual consumption is lower. Stage 2 theoretically consumes 0.6T tokens.
-
-For reference, [ModernBERT](https://arxiv.org/abs/2412.13663) uses 1.72T tokens for stage 1, 250B tokens for stage 2, and 50B tokens for stage 3.
-
 ## Evaluation
 
 JSTS, JNLI, and JCoLA from [JGLUE](https://aclanthology.org/2022.lrec-1.317/) were used.
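The removed note rests on a simple piece of accounting: the theoretical token budget assumes every 1024-token window is completely filled, while unpacked short sentences leave part of each window unused. Below is a minimal sketch of that arithmetic, assuming (this is an illustration, not code from the repository) that each training sequence holds a single sentence padded or truncated to 1024 tokens; all lengths are made up.

```python
# Illustrative only: numbers and helper names are hypothetical, not from
# llm-jp-modernbert. Shows why actual token consumption falls below the
# theoretical steps x batch x seq-len budget when short sentences are not
# packed into full 1024-token windows.

MAX_SEQ_LEN = 1024  # stage-1 maximum sequence length from the README

def actual_tokens(sentence_lengths: list[int]) -> int:
    """Non-padding tokens consumed when each sequence holds one sentence,
    capped at MAX_SEQ_LEN (longer sentences are truncated, shorter ones
    leave the rest of the window as padding)."""
    return sum(min(n, MAX_SEQ_LEN) for n in sentence_lengths)

# Hypothetical tokenized sentence lengths.
lengths = [180, 700, 1024, 3000]

theoretical = len(lengths) * MAX_SEQ_LEN  # 4096: assumes every window is full
actual = actual_tokens(lengths)           # 180 + 700 + 1024 + 1024 = 2928

print(f"theoretical={theoretical}  actual={actual}  ratio={actual/theoretical:.2f}")
```

If sentences were packed (concatenated until each window is full), the actual and theoretical counts would roughly coincide; the gap the README described appears only in the unpacked setting.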
 