Llama3-OpenBioLLM-8B-PsyCourse-fold9

This model is a fine-tuned version of aaditya/Llama3-OpenBioLLM-8B on the course-train-fold9 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0367
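
Since this repository ships a PEFT adapter rather than full model weights (see Framework versions below), it should load on top of the base model via transformers and peft. A minimal sketch, using the adapter repository id chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold9; the dtype and device placement are assumptions, not part of the card:

```python
# Minimal loading sketch: attach the PEFT adapter to the base model.
# bfloat16 and device_map="auto" are illustrative choices, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aaditya/Llama3-OpenBioLLM-8B"
adapter_id = "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold9"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)  # apply the fine-tuned adapter

inputs = tokenizer("Example prompt:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```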

Model description

This repository contains a PEFT adapter (not full model weights) for aaditya/Llama3-OpenBioLLM-8B, fine-tuned on fold 9 of the PsyCourse course-train data.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned on the course-train-fold9 dataset; the loss above is reported on its evaluation set. Further details about the dataset are not provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
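
For reference, the sketch below maps these values onto the standard transformers TrainingArguments API. This is an assumption about the launcher: the original run may have used a different fine-tuning script, and the output_dir name is hypothetical.

```python
# Sketch only: the listed hyperparameters expressed via the Trainer API.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama3-openbiollm-8b-psycourse-fold9",  # hypothetical path
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```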

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5046        | 0.0768 | 50   | 0.3063          |
| 0.1059        | 0.1535 | 100  | 0.0842          |
| 0.0783        | 0.2303 | 150  | 0.0695          |
| 0.0635        | 0.3070 | 200  | 0.0595          |
| 0.075         | 0.3838 | 250  | 0.0530          |
| 0.065         | 0.4606 | 300  | 0.0491          |
| 0.0474        | 0.5373 | 350  | 0.0478          |
| 0.0461        | 0.6141 | 400  | 0.0493          |
| 0.0533        | 0.6908 | 450  | 0.0540          |
| 0.048         | 0.7676 | 500  | 0.0457          |
| 0.0694        | 0.8444 | 550  | 0.0475          |
| 0.0396        | 0.9211 | 600  | 0.0416          |
| 0.0412        | 0.9979 | 650  | 0.0386          |
| 0.0339        | 1.0746 | 700  | 0.0457          |
| 0.0357        | 1.1514 | 750  | 0.0434          |
| 0.0336        | 1.2282 | 800  | 0.0408          |
| 0.0342        | 1.3049 | 850  | 0.0414          |
| 0.0307        | 1.3817 | 900  | 0.0407          |
| 0.0312        | 1.4585 | 950  | 0.0379          |
| 0.0314        | 1.5352 | 1000 | 0.0392          |
| 0.0229        | 1.6120 | 1050 | 0.0367          |
| 0.0337        | 1.6887 | 1100 | 0.0372          |
| 0.028         | 1.7655 | 1150 | 0.0379          |
| 0.0191        | 1.8423 | 1200 | 0.0388          |
| 0.0348        | 1.9190 | 1250 | 0.0411          |
| 0.0469        | 1.9958 | 1300 | 0.0399          |
| 0.0193        | 2.0725 | 1350 | 0.0412          |
| 0.0168        | 2.1493 | 1400 | 0.0416          |
| 0.019         | 2.2261 | 1450 | 0.0390          |
| 0.0268        | 2.3028 | 1500 | 0.0390          |
| 0.0221        | 2.3796 | 1550 | 0.0412          |
| 0.0264        | 2.4563 | 1600 | 0.0408          |
| 0.0248        | 2.5331 | 1650 | 0.0390          |
| 0.018         | 2.6099 | 1700 | 0.0397          |
| 0.0148        | 2.6866 | 1750 | 0.0406          |
| 0.0228        | 2.7634 | 1800 | 0.0416          |
| 0.0216        | 2.8401 | 1850 | 0.0392          |
| 0.021         | 2.9169 | 1900 | 0.0396          |
| 0.016         | 2.9937 | 1950 | 0.0393          |
| 0.0055        | 3.0704 | 2000 | 0.0446          |
| 0.0128        | 3.1472 | 2050 | 0.0464          |
| 0.0105        | 3.2239 | 2100 | 0.0466          |
| 0.009         | 3.3007 | 2150 | 0.0450          |
| 0.0087        | 3.3775 | 2200 | 0.0487          |
| 0.0102        | 3.4542 | 2250 | 0.0473          |
| 0.007         | 3.5310 | 2300 | 0.0486          |
| 0.0113        | 3.6078 | 2350 | 0.0490          |
| 0.0066        | 3.6845 | 2400 | 0.0522          |
| 0.0064        | 3.7613 | 2450 | 0.0510          |
| 0.0095        | 3.8380 | 2500 | 0.0514          |
| 0.0089        | 3.9148 | 2550 | 0.0521          |
| 0.0065        | 3.9916 | 2600 | 0.0524          |
| 0.0034        | 4.0683 | 2650 | 0.0540          |
| 0.0032        | 4.1451 | 2700 | 0.0563          |
| 0.0026        | 4.2218 | 2750 | 0.0564          |
| 0.0024        | 4.2986 | 2800 | 0.0586          |
| 0.0021        | 4.3754 | 2850 | 0.0595          |
| 0.0043        | 4.4521 | 2900 | 0.0604          |
| 0.0019        | 4.5289 | 2950 | 0.0607          |
| 0.0011        | 4.6056 | 3000 | 0.0610          |
| 0.0018        | 4.6824 | 3050 | 0.0617          |
| 0.0051        | 4.7592 | 3100 | 0.0614          |
| 0.0032        | 4.8359 | 3150 | 0.0617          |
| 0.001         | 4.9127 | 3200 | 0.0617          |
| 0.0029        | 4.9894 | 3250 | 0.0618          |
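
The best validation loss in the log, 0.0367 at step 1050 (epoch ~1.61), matches the evaluation loss reported above, suggesting the published adapter corresponds to that checkpoint; validation loss rises steadily after epoch 3. With warmup_ratio=0.1 over the 3250 logged steps, warmup covers roughly the first 325 optimizer steps. A minimal sketch of the implied schedule, using the standard transformers helper (the exact scheduler implementation the run used is an assumption):

```python
# Sketch: cosine LR schedule with 10% linear warmup over 3250 optimizer steps,
# matching lr_scheduler_type="cosine" and warmup_ratio=0.1 above.
import torch
from transformers import get_cosine_schedule_with_warmup

total_steps = 3250
warmup_steps = int(0.1 * total_steps)  # = 325 warmup steps

dummy = torch.nn.Parameter(torch.zeros(1))  # placeholder param for the optimizer
optimizer = torch.optim.AdamW([dummy], lr=1e-4)
scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)

lrs = []
for _ in range(total_steps):
    optimizer.step()
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])

# LR peaks at 1e-4 right after warmup, then decays toward 0 by step 3250.
print(f"peak LR: {max(lrs):.2e}, final LR: {lrs[-1]:.2e}")
```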

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3