Llama3-OpenBioLLM-8B-PsyCourse-fold2

This model is a fine-tuned version of aaditya/Llama3-OpenBioLLM-8B on the course-train-fold2 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0346
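
Because this repository contains a PEFT adapter rather than full model weights (see Framework versions below), it must be loaded on top of the base model. Below is a minimal usage sketch, not a tested reference implementation; the bfloat16/`device_map` settings and the placeholder prompt are assumptions, since the prompt format used in training is not documented on this card.

```python
# Minimal sketch: load the base model, then apply this PEFT adapter on top.
# Repo ids are taken from this card; dtype/device settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "aaditya/Llama3-OpenBioLLM-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold2")
tokenizer = AutoTokenizer.from_pretrained("aaditya/Llama3-OpenBioLLM-8B")

prompt = "Question: ..."  # placeholder; the training prompt format is not documented here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```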

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
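
For illustration, the settings above can be expressed as Hugging Face TrainingArguments. This is a sketch reconstructed from the list, not the original training script; output_dir and any field not listed above are placeholders.

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters listed above (illustrative only);
# output_dir and anything not stated on this card are placeholders.
training_args = TrainingArguments(
    output_dir="llama3-openbiollm-8b-psycourse-fold2",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # yields total_train_batch_size = 16
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",  # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
)
```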

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4179        | 0.0775 | 50   | 0.3434          |
| 0.0933        | 0.1550 | 100  | 0.0820          |
| 0.0662        | 0.2326 | 150  | 0.0668          |
| 0.0584        | 0.3101 | 200  | 0.0589          |
| 0.0666        | 0.3876 | 250  | 0.0527          |
| 0.0448        | 0.4651 | 300  | 0.0521          |
| 0.0474        | 0.5426 | 350  | 0.0490          |
| 0.0546        | 0.6202 | 400  | 0.0431          |
| 0.0432        | 0.6977 | 450  | 0.0393          |
| 0.0526        | 0.7752 | 500  | 0.0401          |
| 0.0506        | 0.8527 | 550  | 0.0400          |
| 0.0622        | 0.9302 | 600  | 0.0419          |
| 0.0363        | 1.0078 | 650  | 0.0380          |
| 0.032         | 1.0853 | 700  | 0.0377          |
| 0.0361        | 1.1628 | 750  | 0.0435          |
| 0.0256        | 1.2403 | 800  | 0.0365          |
| 0.0361        | 1.3178 | 850  | 0.0357          |
| 0.0428        | 1.3953 | 900  | 0.0369          |
| 0.0423        | 1.4729 | 950  | 0.0367          |
| 0.0298        | 1.5504 | 1000 | 0.0382          |
| 0.0357        | 1.6279 | 1050 | 0.0366          |
| 0.0271        | 1.7054 | 1100 | 0.0375          |
| 0.0325        | 1.7829 | 1150 | 0.0370          |
| 0.0328        | 1.8605 | 1200 | 0.0346          |
| 0.0373        | 1.9380 | 1250 | 0.0346          |
| 0.0219        | 2.0155 | 1300 | 0.0351          |
| 0.0179        | 2.0930 | 1350 | 0.0380          |
| 0.018         | 2.1705 | 1400 | 0.0398          |
| 0.0203        | 2.2481 | 1450 | 0.0382          |
| 0.0257        | 2.3256 | 1500 | 0.0405          |
| 0.0165        | 2.4031 | 1550 | 0.0382          |
| 0.0212        | 2.4806 | 1600 | 0.0375          |
| 0.0315        | 2.5581 | 1650 | 0.0373          |
| 0.0155        | 2.6357 | 1700 | 0.0379          |
| 0.0188        | 2.7132 | 1750 | 0.0379          |
| 0.0195        | 2.7907 | 1800 | 0.0397          |
| 0.0213        | 2.8682 | 1850 | 0.0373          |
| 0.0171        | 2.9457 | 1900 | 0.0374          |
| 0.0108        | 3.0233 | 1950 | 0.0390          |
| 0.0125        | 3.1008 | 2000 | 0.0437          |
| 0.0046        | 3.1783 | 2050 | 0.0459          |
| 0.0059        | 3.2558 | 2100 | 0.0479          |
| 0.0088        | 3.3333 | 2150 | 0.0432          |
| 0.0074        | 3.4109 | 2200 | 0.0455          |
| 0.0105        | 3.4884 | 2250 | 0.0493          |
| 0.0116        | 3.5659 | 2300 | 0.0510          |
| 0.01          | 3.6434 | 2350 | 0.0481          |
| 0.0126        | 3.7209 | 2400 | 0.0474          |
| 0.0061        | 3.7984 | 2450 | 0.0477          |
| 0.0088        | 3.8760 | 2500 | 0.0487          |
| 0.0074        | 3.9535 | 2550 | 0.0488          |
| 0.0076        | 4.0310 | 2600 | 0.0499          |
| 0.0051        | 4.1085 | 2650 | 0.0524          |
| 0.0038        | 4.1860 | 2700 | 0.0556          |
| 0.0031        | 4.2636 | 2750 | 0.0584          |
| 0.0028        | 4.3411 | 2800 | 0.0602          |
| 0.0037        | 4.4186 | 2850 | 0.0612          |
| 0.0037        | 4.4961 | 2900 | 0.0620          |
| 0.0013        | 4.5736 | 2950 | 0.0626          |
| 0.0013        | 4.6512 | 3000 | 0.0631          |
| 0.0023        | 4.7287 | 3050 | 0.0634          |
| 0.0042        | 4.8062 | 3100 | 0.0635          |
| 0.0053        | 4.8837 | 3150 | 0.0635          |
| 0.0041        | 4.9612 | 3200 | 0.0634          |
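
Validation loss reaches its minimum of 0.0346 around epoch 1.9 (steps 1200–1250) and rises steadily from epoch 3 onward, indicating overfitting in later epochs; the reported evaluation loss matches this minimum, which suggests the best checkpoint was retained.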

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3