updated model summary
README.md
CHANGED
@@ -26,7 +26,7 @@ widget:
 
 ## Model Summary
 
-The Narrow Transformer (NT) model NT-Java-1.1B is an open-source specialized code model built on
+The Narrow Transformer (NT) model NT-Java-1.1B is an open-source specialized code model built by extending pre-training on starcoderbase-1b, designed for code related tasks in Java programming. The model is a decoder-only transformer with Multi-Query-Attention and a context length of 8192 tokens. The model has been trained with Java subset of the starcoderdata dataset, which is ~22B tokens.
 
 - **Repository:** [bigcode/Megatron-LM](https://github.com/bigcode-project/Megatron-LM)
 - **Project Website:**