silence09 committed · Commit a042fd0 · verified · 1 Parent(s): dc69c52

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ This project is created using the official **Deepseek R1** model script (`modeli
  The three hidden layers consist of:
  - **A hidden layer: MLA + Dense MLP**
  - **A hidden layer: MLA + MoE (Mixture of Experts) MLP**
- - **A MTP (Multi-Token Pretraining) layer (MTP can be regarded or used for speculative decoding in inference) **
+ - **A MTP (Multi-Token Pretraining) layer (MTP can be regarded or used for speculative decoding in inference)**
 
  ## Purpose
  The purpose of these weights is to provide a lightweight implementation for researchers who want to study the model architecture and run experiments quickly.