---
library_name: transformers
license: apache-2.0
base_model: google-bert/bert-large-uncased
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: math_question_grade_detection_Bert_databalanced
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# math_question_grade_detection_Bert_databalanced

This model is a fine-tuned version of [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.6880
- Accuracy: 0.7603
- Precision: 0.7651
- Recall: 0.7603
- F1: 0.7588
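
Recall equals Accuracy here and in every row of the training results table. That pattern is what support-weighted averaging over classes produces, although this card does not state which averaging mode was used. The sketch below is a minimal pure-Python illustration of the identity; the labels and predictions are made-up toy values, not data from this model.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights proportional to class support."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)

# Toy labels for three hypothetical classes (assumption, for illustration only).
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 2, 2, 2, 0, 2, 2]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
print(acc, rec)
```

The support weights cancel the per-class denominators, so the weighted sum reduces to total correct predictions divided by total examples, which is accuracy.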

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 1100
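
As a rough illustration of how these values interact, the sketch below computes the effective batch size and the learning rate implied by a linear schedule with warmup. The function name `linear_lr` is hypothetical, and the formula is a plain re-implementation of the usual linear warmup and decay shape, not code taken from the training run.

```python
def linear_lr(step, base_lr=2e-5, warmup_steps=100, total_steps=1100):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# train_batch_size * gradient_accumulation_steps; matching the listed
# total_train_batch_size of 32 is consistent with a single device.
effective_batch = 16 * 2

for step in (0, 50, 100, 600, 1100):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 2e-05 at step 100, falls to half that value at step 600, and reaches zero at step 1100.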

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| No log        | 0.2817 | 50   | 2.1003          | 0.2349   | 0.3799    | 0.2349 | 0.2106 |
| No log        | 0.5634 | 100  | 1.9607          | 0.2762   | 0.3337    | 0.2762 | 0.2498 |
| No log        | 0.8451 | 150  | 1.5031          | 0.4778   | 0.4633    | 0.4778 | 0.4591 |
| No log        | 1.1268 | 200  | 1.2546          | 0.5460   | 0.5596    | 0.5460 | 0.5176 |
| No log        | 1.4085 | 250  | 1.0941          | 0.5746   | 0.5804    | 0.5746 | 0.5675 |
| No log        | 1.6901 | 300  | 0.9381          | 0.6730   | 0.6943    | 0.6730 | 0.6721 |
| No log        | 1.9718 | 350  | 0.8974          | 0.6619   | 0.6822    | 0.6619 | 0.6570 |
| No log        | 2.2535 | 400  | 0.8243          | 0.6889   | 0.6913    | 0.6889 | 0.6856 |
| No log        | 2.5352 | 450  | 0.8219          | 0.6937   | 0.7131    | 0.6937 | 0.6881 |
| 1.2537        | 2.8169 | 500  | 0.7642          | 0.7159   | 0.7239    | 0.7159 | 0.7121 |
| 1.2537        | 3.0986 | 550  | 0.7580          | 0.7175   | 0.7197    | 0.7175 | 0.7068 |
| 1.2537        | 3.3803 | 600  | 0.7310          | 0.7397   | 0.7523    | 0.7397 | 0.7387 |
| 1.2537        | 3.6620 | 650  | 0.7562          | 0.7413   | 0.7466    | 0.7413 | 0.7349 |
| 1.2537        | 3.9437 | 700  | 0.6512          | 0.7730   | 0.7792    | 0.7730 | 0.7726 |
| 1.2537        | 4.2254 | 750  | 0.6941          | 0.7476   | 0.7484    | 0.7476 | 0.7447 |
| 1.2537        | 4.5070 | 800  | 0.6866          | 0.7571   | 0.7607    | 0.7571 | 0.7550 |
| 1.2537        | 4.7887 | 850  | 0.6942          | 0.7603   | 0.7644    | 0.7603 | 0.7588 |
| 1.2537        | 5.0704 | 900  | 0.7230          | 0.7683   | 0.7821    | 0.7683 | 0.7656 |
| 1.2537        | 5.3521 | 950  | 0.7123          | 0.7603   | 0.7669    | 0.7603 | 0.7588 |
| 0.321         | 5.6338 | 1000 | 0.6939          | 0.7667   | 0.7725    | 0.7667 | 0.7652 |
| 0.321         | 5.9155 | 1050 | 0.6884          | 0.7667   | 0.7723    | 0.7667 | 0.7657 |
| 0.321         | 6.1972 | 1100 | 0.6880          | 0.7603   | 0.7651    | 0.7603 | 0.7588 |
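
The evaluation results reported at the top of this card match the final row (step 1100), while earlier rows show lower validation loss. The sketch below scans the table values to find the strongest checkpoint by validation loss and by F1. The tuple layout and variable names are illustrative, and the card does not state whether intermediate checkpoints were saved or whether an option such as `load_best_model_at_end` was used.

```python
# (step, validation_loss, f1) copied from the training results table above.
rows = [
    (50, 2.1003, 0.2106), (100, 1.9607, 0.2498), (150, 1.5031, 0.4591),
    (200, 1.2546, 0.5176), (250, 1.0941, 0.5675), (300, 0.9381, 0.6721),
    (350, 0.8974, 0.6570), (400, 0.8243, 0.6856), (450, 0.8219, 0.6881),
    (500, 0.7642, 0.7121), (550, 0.7580, 0.7068), (600, 0.7310, 0.7387),
    (650, 0.7562, 0.7349), (700, 0.6512, 0.7726), (750, 0.6941, 0.7447),
    (800, 0.6866, 0.7550), (850, 0.6942, 0.7588), (900, 0.7230, 0.7656),
    (950, 0.7123, 0.7588), (1000, 0.6939, 0.7652), (1050, 0.6884, 0.7657),
    (1100, 0.6880, 0.7588),
]

best_by_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_by_f1 = max(rows, key=lambda r: r[2])    # highest F1
final = rows[-1]                              # checkpoint reported in the summary

print("best by loss:", best_by_loss)
print("best by F1:  ", best_by_f1)
print("final:       ", final)
```

Both criteria point to step 700 (validation loss 0.6512, F1 0.7726), which is ahead of the final step on both metrics.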


### Framework versions

- Transformers 4.46.3
- Pytorch 2.4.0
- Datasets 3.1.0
- Tokenizers 0.20.3