voroninip committed on
Commit 882bfd5 · verified · 1 Parent(s): 6fa8b0e

End of training

Files changed (4)
  1. README.md +17 -17
  2. config.json +1 -2
  3. model.safetensors +1 -1
  4. training_args.bin +2 -2
README.md CHANGED
@@ -9,8 +9,6 @@ metrics:
  model-index:
  - name: bert-paper-classifier-arxiv
  results: []
- datasets:
- - arxiv-community/arxiv_dataset
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -20,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 3.7652
- - Accuracy: 0.31
+ - Loss: 1.1651
+ - Accuracy: 0.6854
 
  ## Model description
 
@@ -41,28 +39,30 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 5e-05
- - train_batch_size: 64
- - eval_batch_size: 8
+ - train_batch_size: 128
+ - eval_batch_size: 16
  - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 128
+ - gradient_accumulation_steps: 3
+ - total_train_batch_size: 384
  - optimizer: Use OptimizerNames.ADAFACTOR and the args are:
  No additional optimizer arguments
  - lr_scheduler_type: cosine
- - num_epochs: 2
- - mixed_precision_training: Native AMP
+ - num_epochs: 5
 
  ### Training results
 
- | Training Loss | Epoch | Step | Validation Loss | Accuracy |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|
- | No log | 1.0 | 8 | 4.0299 | 0.31 |
- | No log | 1.8 | 14 | 3.7652 | 0.31 |
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:------:|:----:|:---------------:|:--------:|
+ | No log | 1.0 | 97 | 1.5356 | 0.6095 |
+ | 2.3002 | 2.0 | 194 | 1.3083 | 0.6656 |
+ | 1.5053 | 3.0 | 291 | 1.2135 | 0.6822 |
+ | 1.3177 | 4.0 | 388 | 1.1702 | 0.6846 |
+ | 1.1984 | 4.9550 | 480 | 1.1651 | 0.6854 |
 
 
  ### Framework versions
 
- - Transformers 4.48.3
- - Pytorch 2.5.1+cu124
+ - Transformers 4.50.3
+ - Pytorch 2.6.0+cu124
  - Datasets 3.5.0
- - Tokenizers 0.21.0
+ - Tokenizers 0.21.1
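
The batch-size values on both sides of the README.md diff fit the usual relationship in which the total train batch size is the per-step batch size multiplied by the gradient accumulation steps and the number of devices. A minimal sketch of that arithmetic follows. The helper name `effective_batch_size` is hypothetical, and the single-device default is an assumption, since the model card does not state how many devices were used.

```python
def effective_batch_size(train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples that contribute to one optimizer step.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values copied from the removed (-) and added (+) lines of the diff.
old_total = effective_batch_size(train_batch_size=64, gradient_accumulation_steps=2)
new_total = effective_batch_size(train_batch_size=128, gradient_accumulation_steps=3)

print(old_total, new_total)
```

Under the single-device assumption, the results agree with the `total_train_batch_size` values the card reports before and after this commit (128 and 384).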
config.json CHANGED
@@ -1,5 +1,4 @@
  {
- "_name_or_path": "microsoft/deberta-v3-base",
  "architectures": [
  "DebertaV2ForSequenceClassification"
  ],
@@ -286,7 +285,7 @@
  "relative_attention": true,
  "share_att_key": true,
  "torch_dtype": "float32",
- "transformers_version": "4.48.3",
+ "transformers_version": "4.50.3",
  "type_vocab_size": 0,
  "vocab_size": 128100
  }
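
The config.json change can also be summarized key by key instead of line by line. The sketch below compares two configuration dictionaries and reports removed, added, and changed keys. The helper `diff_config` is hypothetical, and the two dictionaries contain only the keys visible in the diff hunks, not the full file.

```python
def diff_config(old: dict, new: dict) -> dict:
    """Summarize top-level differences between two config dictionaries."""
    removed = sorted(k for k in old if k not in new)
    added = sorted(k for k in new if k not in old)
    changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
    return {"removed": removed, "added": added, "changed": changed}


# Only the keys shown in the diff hunks; the real file has many more entries.
old_cfg = {
    "_name_or_path": "microsoft/deberta-v3-base",
    "architectures": ["DebertaV2ForSequenceClassification"],
    "torch_dtype": "float32",
    "transformers_version": "4.48.3",
    "vocab_size": 128100,
}
new_cfg = {
    "architectures": ["DebertaV2ForSequenceClassification"],
    "torch_dtype": "float32",
    "transformers_version": "4.50.3",
    "vocab_size": 128100,
}

summary = diff_config(old_cfg, new_cfg)
print(summary)
```

For the full files, the same function could be applied to the result of `json.load` on each version of config.json.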
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7f3fb104f4ad4cff08e2ea25577c49fbeaddfcc3e35e1c4d32e3404a53f72735
+ oid sha256:a8b7d00bce34e155430689ef88afc5c1e584376baa041d826656bc856329ad72
  size 738100720
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b4c90c3ee118220b93359f38e74e9dd3ff5792a4636d790cbb5edad798236fff
- size 5304
+ oid sha256:d6ce3aab3f6393c7f4627d687668ded8e3aa2050a9dada8059243cbbbfc504ea
+ size 5368
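
The model.safetensors and training_args.bin diffs show Git LFS pointer files, not the binaries themselves: each pointer records a spec version, a SHA-256 object id, and a size in bytes. As a rough sketch, a pointer can be parsed and checked against downloaded bytes as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check reads the whole payload into memory, which is only reasonable for small files such as training_args.bin.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest the pointer records."""
    return (
        pointer["algo"] == "sha256"
        and len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["oid"]
    )


# New-side pointer for training_args.bin, copied from the diff.
new_training_args_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d6ce3aab3f6393c7f4627d687668ded8e3aa2050a9dada8059243cbbbfc504ea
size 5368
"""

pointer = parse_lfs_pointer(new_training_args_pointer)
print(pointer["algo"], pointer["size"])
```

For a file the size of model.safetensors (738100720 bytes), hashing in chunks with `hashlib.sha256().update(...)` would be the more practical choice.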