rawsh and lbourdois committed · Commit e4d23b2 · verified · 1 Parent(s): d230f00

Improve language tag (#1)


- Improve language tag (a06d7e47c59193ac1a28f9cfad850249b7ef8335)


Co-authored-by: Loïck BOURDOIS <[email protected]>

Files changed (1): README.md (+126, -112)
README.md CHANGED
@@ -1,112 +1,126 @@
- ---
- license: apache-2.0
- base_model: Qwen/Qwen2.5-0.5B
- tags:
- - axolotl
- - generated_from_trainer
- model-index:
- - name: MetaMath-Qwen2.5-0.5b-PRM
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.1`
- ```yaml
- base_model: Qwen/Qwen2.5-0.5B
- bf16: auto
- dataset_prepared_path: /training/data/prepared
- datasets:
- - conversation: llama3
-   path: RLHFlow/Mistral-PRM-Data
-   split: train
-   train_on_split: train
-   type: sharegpt
- flash_attention: true
- fp16: false
- gradient_accumulation_steps: 4
- gradient_checkpointing: true
- hub_model_id: rawsh/MetaMath-Qwen2.5-0.5b-PRM
- hub_strategy: every_save
- learning_rate: 2.0e-06
- load_in_4bit: false
- load_in_8bit: false
- logging_steps: 2
- lr_scheduler: cosine
- max_grad_norm: 1.0
- micro_batch_size: 1
- model_type: AutoModelForCausalLM
- num_epochs: 1
- optimizer: paged_adamw_32bit
- output_dir: /training/prm
- pad_to_sequence_len: true
- push_to_hub: true
- sample_packing: true
- save_safetensors: true
- save_strategy: epoch
- save_total_limit: 4
- sequence_len: 8192
- special_tokens:
-   pad_token: <|endoftext|>
- strict: false
- tf32: true
- tokenizer_type: AutoTokenizer
- train_on_inputs: false
- trust_remote_code: true
- val_set_size: 0.0
- wandb_name: qwen2.5-0.5b-bs32_lr2e-6_prm
- wandb_project: preference-models
- warmup_ratio: 0.05
- weight_decay: 0.0
-
- ```
-
- </details><br>
-
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/dankgpt/preference-models/runs/eqqhapl0)
- # MetaMath-Qwen2.5-0.5b-PRM
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) on the None dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 2e-06
- - train_batch_size: 1
- - eval_batch_size: 1
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 4
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 214
- - num_epochs: 1
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.42.3
- - Pytorch 2.3.0+cu121
- - Datasets 2.19.1
- - Tokenizers 0.19.1
+ ---
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-0.5B
+ tags:
+ - axolotl
+ - generated_from_trainer
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ model-index:
+ - name: MetaMath-Qwen2.5-0.5b-PRM
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+ <details><summary>See axolotl config</summary>
+
+ axolotl version: `0.4.1`
+ ```yaml
+ base_model: Qwen/Qwen2.5-0.5B
+ bf16: auto
+ dataset_prepared_path: /training/data/prepared
+ datasets:
+ - conversation: llama3
+   path: RLHFlow/Mistral-PRM-Data
+   split: train
+   train_on_split: train
+   type: sharegpt
+ flash_attention: true
+ fp16: false
+ gradient_accumulation_steps: 4
+ gradient_checkpointing: true
+ hub_model_id: rawsh/MetaMath-Qwen2.5-0.5b-PRM
+ hub_strategy: every_save
+ learning_rate: 2.0e-06
+ load_in_4bit: false
+ load_in_8bit: false
+ logging_steps: 2
+ lr_scheduler: cosine
+ max_grad_norm: 1.0
+ micro_batch_size: 1
+ model_type: AutoModelForCausalLM
+ num_epochs: 1
+ optimizer: paged_adamw_32bit
+ output_dir: /training/prm
+ pad_to_sequence_len: true
+ push_to_hub: true
+ sample_packing: true
+ save_safetensors: true
+ save_strategy: epoch
+ save_total_limit: 4
+ sequence_len: 8192
+ special_tokens:
+   pad_token: <|endoftext|>
+ strict: false
+ tf32: true
+ tokenizer_type: AutoTokenizer
+ train_on_inputs: false
+ trust_remote_code: true
+ val_set_size: 0.0
+ wandb_name: qwen2.5-0.5b-bs32_lr2e-6_prm
+ wandb_project: preference-models
+ warmup_ratio: 0.05
+ weight_decay: 0.0
+
+ ```
+
+ </details><br>
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/dankgpt/preference-models/runs/eqqhapl0)
+ # MetaMath-Qwen2.5-0.5b-PRM
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B) on the None dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-06
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 4
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 214
+ - num_epochs: 1
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - Transformers 4.42.3
+ - Pytorch 2.3.0+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
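
For orientation, the config in this card trains a process reward model (PRM) on RLHFlow/Mistral-PRM-Data, where each reasoning step is typically presented as a user turn and the assistant replies with a "+" (correct step) or "-" (incorrect step) token. The snippet below is a minimal scoring sketch under that assumption, not documented behaviour of this exact checkpoint: the question and step strings are placeholders, and the chat template and label tokens should be verified against the uploaded tokenizer before use.

```python
# Minimal sketch: score one reasoning step with the PRM checkpoint.
# Assumes RLHFlow-style "+"/"-" step labels; verify against the actual tokenizer and template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rawsh/MetaMath-Qwen2.5-0.5b-PRM"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)
model.eval()

# Placeholder problem and candidate step (hypothetical example).
question = "Natalia sold 48 clips in April and half as many in May. How many in total?"
step = "Step 1: In May she sold 48 / 2 = 24 clips."

# Format a single user turn with the tokenizer's saved chat template.
messages = [{"role": "user", "content": f"{question}\n\n{step}"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]

# Compare the "+" and "-" token logits to get a step-level correctness score.
plus_id = tokenizer.encode("+", add_special_tokens=False)[0]
minus_id = tokenizer.encode("-", add_special_tokens=False)[0]
p_good = torch.softmax(next_token_logits[[plus_id, minus_id]], dim=-1)[0].item()
print(f"P(step is good) = {p_good:.3f}")
```

When ranking full solutions, per-step scores are usually aggregated (for example by taking the minimum or the product over steps); which aggregation was intended for this checkpoint is not stated in the card.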