praveenkumarb (BalajiInfosys) committed
Commit 747372a · verified · 1 parent: 8740e1b

Update README.md (#1)


- Update README.md (98e2e344d7fd5c819137899d4e840c8d29a9f670)


Co-authored-by: A J <[email protected]>

Files changed (1):
  1. README.md (+2 −2)
README.md CHANGED
@@ -26,7 +26,7 @@ widget:
 
  ## Model Summary
 
- The JavaCoder models are !B parameter models trained on 80+ programming languages from [The Stack (v1.2)](https://huggingface.co/datasets/bigcode/the-stack), with opt-out requests excluded. The model uses [Multi Query Attention](https://arxiv.org/abs/1911.02150), [a context window of 8192 tokens](https://arxiv.org/abs/2205.14135), and was trained using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255) on 1 trillion tokens.
+ The JavaCoder models are 1B parameter models trained on 80+ programming languages from [The Stack (v1.2)](https://huggingface.co/datasets/bigcode/the-stack), with opt-out requests excluded. The model uses [Multi Query Attention](https://arxiv.org/abs/1911.02150), [a context window of 8192 tokens](https://arxiv.org/abs/2205.14135), and was trained using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255) on 1 trillion tokens.
 
  - **Repository:**
  - **Project Website:**
@@ -88,7 +88,7 @@ The model has been trained on source code from 80+ programming languages. The pr
 
  ## Hardware
 
  - **GPUs:** 6 NVIDIA A100 80GB
- - **Training time:** days
+ - **Training time:** 4 days
 
  ## Software
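
The model card above says the model was trained with the Fill-in-the-Middle objective, which means inference prompts can place code before and after a "hole" to be completed. As a minimal sketch of how such a prompt is typically assembled: the `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel tokens below follow the StarCoder-family convention and are an assumption here, not confirmed by this README — check the model's tokenizer config for the actual special tokens.

```python
# Hypothetical sketch of a Fill-in-the-Middle (FIM) prompt builder.
# ASSUMPTION: the sentinel tokens below are the StarCoder-style defaults;
# verify them against the model's tokenizer before use.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code before and after the hole in PSM
    (prefix-suffix-middle) order; the model generates the middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Example: ask the model to fill in the body of a Java method.
prompt = build_fim_prompt(
    prefix="public int add(int a, int b) {\n    return ",
    suffix=";\n}",
)
print(prompt)
```

The resulting string would then be passed to the tokenizer and model like any ordinary causal-LM prompt; the FIM ordering is purely a prompt-formatting convention learned during training.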