bm_train1-8_eval21-25_lr1e-5

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a hedged loading sketch follows these results):

  • Loss: 2.3269
  • Accuracy: 0.546

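Since neither the base model nor the task is documented on this card, the snippet below is only a hypothetical sketch of loading the checkpoint with the generic Auto classes; the repository id is a placeholder, and the appropriate head class (for example, a classification head, given the accuracy metric) depends on the actual task.

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical repository id -- replace with the actual checkpoint location.
checkpoint = "your-username/bm_train1-8_eval21-25_lr1e-5"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

# Encode a sample input and run a forward pass.
inputs = tokenizer("example input text", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```
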
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged reproduction sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 7658372
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1

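As a rough guide to reproducing this configuration, the sketch below maps the hyperparameters above onto Hugging Face TrainingArguments. Only the values listed on this card are taken from it; the output directory, the model, and the datasets are placeholders, since neither the base model nor the training data is documented.

```python
from transformers import TrainingArguments, Trainer

# Hedged sketch: only the hyperparameter values above come from this card.
# output_dir, model, train_ds, and eval_ds are hypothetical placeholders.
training_args = TrainingArguments(
    output_dir="bm_train1-8_eval21-25_lr1e-5",
    learning_rate=1e-5,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=7658372,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    eval_strategy="steps",
    eval_steps=100,   # the results table below logs validation metrics every 100 steps
    logging_steps=100,
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```
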
Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy
No log 0 0 2.6414 0.0
2.6425 0.0064 100 2.6413 0.0
2.6424 0.0128 200 2.6410 0.0
2.6407 0.0192 300 2.6405 0.0
2.6374 0.0256 400 2.6398 0.0
2.6367 0.032 500 2.6389 0.0
2.6372 0.0384 600 2.6379 0.0
2.6372 0.0448 700 2.6366 0.0
2.634 0.0512 800 2.6350 0.0
2.6349 0.0576 900 2.6330 0.0
2.6337 0.064 1000 2.6308 0.0
2.6276 0.0704 1100 2.6284 0.0
2.6237 0.0768 1200 2.6256 0.0
2.6237 0.0832 1300 2.6227 0.0
2.6234 0.0896 1400 2.6195 0.0
2.6157 0.096 1500 2.6160 0.0
2.6118 0.1024 1600 2.6123 0.0
2.6095 0.1088 1700 2.6085 0.0
2.6063 0.1152 1800 2.6047 0.0
2.6017 0.1216 1900 2.6009 0.546
2.5971 0.128 2000 2.5971 0.546
2.5954 0.1344 2100 2.5933 0.546
2.5872 0.1408 2200 2.5895 0.546
2.5839 0.1472 2300 2.5857 0.546
2.5847 0.1536 2400 2.5819 0.546
2.5778 0.16 2500 2.5781 0.546
2.5714 0.1664 2600 2.5742 0.546
2.5699 0.1728 2700 2.5704 0.546
2.565 0.1792 2800 2.5666 0.546
2.5638 0.1856 2900 2.5628 0.546
2.5592 0.192 3000 2.5589 0.546
2.5564 0.1984 3100 2.5551 0.546
2.5486 0.2048 3200 2.5513 0.546
2.5454 0.2112 3300 2.5475 0.546
2.5448 0.2176 3400 2.5437 0.546
2.541 0.224 3500 2.5399 0.546
2.5337 0.2304 3600 2.5361 0.546
2.5337 0.2368 3700 2.5324 0.546
2.5278 0.2432 3800 2.5286 0.546
2.5233 0.2496 3900 2.5249 0.546
2.5214 0.256 4000 2.5212 0.546
2.5166 0.2624 4100 2.5175 0.546
2.5189 0.2688 4200 2.5138 0.546
2.5098 0.2752 4300 2.5101 0.546
2.507 0.2816 4400 2.5065 0.546
2.5015 0.288 4500 2.5028 0.546
2.4993 0.2944 4600 2.4992 0.546
2.4946 0.3008 4700 2.4957 0.546
2.4905 0.3072 4800 2.4921 0.546
2.4897 0.3136 4900 2.4886 0.546
2.4873 0.32 5000 2.4851 0.546
2.4822 0.3264 5100 2.4816 0.546
2.4801 0.3328 5200 2.4782 0.546
2.4784 0.3392 5300 2.4747 0.546
2.4728 0.3456 5400 2.4714 0.546
2.4686 0.352 5500 2.4680 0.546
2.4635 0.3584 5600 2.4647 0.546
2.4619 0.3648 5700 2.4613 0.546
2.4572 0.3712 5800 2.4581 0.546
2.4545 0.3776 5900 2.4548 0.546
2.4547 0.384 6000 2.4516 0.546
2.4482 0.3904 6100 2.4484 0.546
2.4453 0.3968 6200 2.4453 0.546
2.4399 0.4032 6300 2.4422 0.546
2.4417 0.4096 6400 2.4391 0.546
2.4361 0.416 6500 2.4361 0.546
2.436 0.4224 6600 2.4331 0.546
2.4293 0.4288 6700 2.4302 0.546
2.4264 0.4352 6800 2.4272 0.546
2.4241 0.4416 6900 2.4244 0.546
2.4206 0.448 7000 2.4215 0.546
2.4178 0.4544 7100 2.4187 0.546
2.4148 0.4608 7200 2.4160 0.546
2.4135 0.4672 7300 2.4132 0.546
2.4085 0.4736 7400 2.4106 0.546
2.4053 0.48 7500 2.4079 0.546
2.4044 0.4864 7600 2.4053 0.546
2.4016 0.4928 7700 2.4028 0.546
2.4 0.4992 7800 2.4003 0.546
2.3987 0.5056 7900 2.3978 0.546
2.393 0.512 8000 2.3954 0.546
2.3912 0.5184 8100 2.3930 0.546
2.3918 0.5248 8200 2.3907 0.546
2.3884 0.5312 8300 2.3884 0.546
2.3876 0.5376 8400 2.3861 0.546
2.3825 0.544 8500 2.3839 0.546
2.3833 0.5504 8600 2.3818 0.546
2.3817 0.5568 8700 2.3797 0.546
2.3791 0.5632 8800 2.3776 0.546
2.3759 0.5696 8900 2.3756 0.546
2.3751 0.576 9000 2.3737 0.546
2.3723 0.5824 9100 2.3717 0.546
2.3731 0.5888 9200 2.3699 0.546
2.3674 0.5952 9300 2.3680 0.546
2.3659 0.6016 9400 2.3663 0.546
2.3633 0.608 9500 2.3645 0.546
2.3637 0.6144 9600 2.3628 0.546
2.3594 0.6208 9700 2.3612 0.546
2.3637 0.6272 9800 2.3596 0.546
2.3574 0.6336 9900 2.3580 0.546
2.3595 0.64 10000 2.3565 0.546
2.355 0.6464 10100 2.3551 0.546
2.3515 0.6528 10200 2.3536 0.546
2.3514 0.6592 10300 2.3523 0.546
2.353 0.6656 10400 2.3509 0.546
2.3487 0.672 10500 2.3496 0.546
2.349 0.6784 10600 2.3484 0.546
2.3463 0.6848 10700 2.3472 0.546
2.3448 0.6912 10800 2.3460 0.546
2.3506 0.6976 10900 2.3449 0.546
2.3423 0.704 11000 2.3438 0.546
2.3467 0.7104 11100 2.3428 0.546
2.3415 0.7168 11200 2.3418 0.546
2.3402 0.7232 11300 2.3408 0.546
2.3381 0.7296 11400 2.3399 0.546
2.3393 0.736 11500 2.3390 0.546
2.337 0.7424 11600 2.3382 0.546
2.3365 0.7488 11700 2.3374 0.546
2.3366 0.7552 11800 2.3366 0.546
2.3374 0.7616 11900 2.3359 0.546
2.3376 0.768 12000 2.3352 0.546
2.3303 0.7744 12100 2.3346 0.546
2.3336 0.7808 12200 2.3339 0.546
2.3345 0.7872 12300 2.3333 0.546
2.3331 0.7936 12400 2.3328 0.546
2.331 0.8 12500 2.3323 0.546
2.3305 0.8064 12600 2.3318 0.546
2.3301 0.8128 12700 2.3313 0.546
2.3338 0.8192 12800 2.3309 0.546
2.3313 0.8256 12900 2.3305 0.546
2.3304 0.832 13000 2.3301 0.546
2.33 0.8384 13100 2.3297 0.546
2.3282 0.8448 13200 2.3294 0.546
2.3291 0.8512 13300 2.3291 0.546
2.3282 0.8576 13400 2.3288 0.546
2.3295 0.864 13500 2.3286 0.546
2.3316 0.8704 13600 2.3284 0.546
2.3275 0.8768 13700 2.3281 0.546
2.3297 0.8832 13800 2.3280 0.546
2.329 0.8896 13900 2.3278 0.546
2.3297 0.896 14000 2.3276 0.546
2.3284 0.9024 14100 2.3275 0.546
2.3289 0.9088 14200 2.3274 0.546
2.3285 0.9152 14300 2.3273 0.546
2.3251 0.9216 14400 2.3272 0.546
2.3274 0.928 14500 2.3271 0.546
2.3273 0.9344 14600 2.3271 0.546
2.3279 0.9408 14700 2.3270 0.546
2.3251 0.9472 14800 2.3270 0.546
2.3248 0.9536 14900 2.3269 0.546
2.3239 0.96 15000 2.3269 0.546
2.3302 0.9664 15100 2.3269 0.546
2.3265 0.9728 15200 2.3269 0.546
2.3238 0.9792 15300 2.3269 0.546
2.3274 0.9856 15400 2.3269 0.546
2.3238 0.992 15500 2.3269 0.546
2.325 0.9984 15600 2.3269 0.546
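
Evaluation ran every 100 steps and reports validation loss together with accuracy. The metric code itself is not part of this card; as an assumption only, a typical accuracy-based compute_metrics hook for the Trainer would look like the sketch below.

```python
import numpy as np
import evaluate

# Assumed metric implementation -- not taken from this card.
accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    # Flatten so the same hook works for sequence- or token-level labels.
    return accuracy.compute(predictions=predictions.reshape(-1),
                            references=labels.reshape(-1))
```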

Framework versions

  • Transformers 4.46.0
  • Pytorch 2.5.1
  • Datasets 3.1.0
  • Tokenizers 0.20.1