Llama3-OpenBioLLM-8B-PsyCourse-fold8

This model is a PEFT adapter fine-tuned from aaditya/Llama3-OpenBioLLM-8B on the course-train-fold8 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0360
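
A minimal loading sketch, assuming the adapter is hosted as chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold8 (the repo id in this card's title) and is applied on top of the base model with PEFT; the prompt is purely illustrative:

```python
# Sketch: load the base model and apply this LoRA/PEFT adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "aaditya/Llama3-OpenBioLLM-8B"
adapter_id = "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold8"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs accelerate
)
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Question: What is cognitive dissonance?\nAnswer:"  # illustrative only
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```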

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
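
For reference, the same settings restated as a transformers.TrainingArguments sketch; argument names follow the Transformers 4.46 API, and output_dir (like anything else not listed above) is an assumption:

```python
# Sketch only: the hyperparameters above mapped onto TrainingArguments.
# The original training script is not shown in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-openbiollm-8b-psycourse-fold8",  # assumed placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # total train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```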

Training results

Training Loss   Epoch    Step   Validation Loss
0.5651          0.0758     50   0.3454
0.159           0.1516    100   0.0916
0.0775          0.2275    150   0.0660
0.0565          0.3033    200   0.0587
0.0563          0.3791    250   0.0594
0.0631          0.4549    300   0.0575
0.0677          0.5308    350   0.0503
0.0366          0.6066    400   0.0471
0.0397          0.6824    450   0.0430
0.0383          0.7582    500   0.0479
0.0508          0.8340    550   0.0427
0.0346          0.9099    600   0.0434
0.0513          0.9857    650   0.0444
0.0339          1.0615    700   0.0417
0.0296          1.1373    750   0.0442
0.0288          1.2132    800   0.0397
0.0299          1.2890    850   0.0421
0.0293          1.3648    900   0.0401
0.0278          1.4406    950   0.0393
0.0283          1.5164   1000   0.0405
0.0493          1.5923   1050   0.0393
0.0287          1.6681   1100   0.0392
0.0383          1.7439   1150   0.0379
0.0312          1.8197   1200   0.0378
0.0353          1.8956   1250   0.0379
0.0242          1.9714   1300   0.0360
0.0176          2.0472   1350   0.0413
0.0132          2.1230   1400   0.0386
0.0224          2.1988   1450   0.0413
0.0198          2.2747   1500   0.0423
0.0191          2.3505   1550   0.0429
0.017           2.4263   1600   0.0412
0.0194          2.5021   1650   0.0465
0.0178          2.5780   1700   0.0439
0.0238          2.6538   1750   0.0411
0.0181          2.7296   1800   0.0414
0.0128          2.8054   1850   0.0439
0.0287          2.8812   1900   0.0410
0.0202          2.9571   1950   0.0418
0.011           3.0329   2000   0.0430
0.005           3.1087   2050   0.0487
0.0045          3.1845   2100   0.0502
0.0072          3.2604   2150   0.0496
0.0098          3.3362   2200   0.0482
0.0089          3.4120   2250   0.0492
0.0072          3.4878   2300   0.0486
0.0116          3.5636   2350   0.0496
0.0094          3.6395   2400   0.0489
0.0055          3.7153   2450   0.0501
0.0095          3.7911   2500   0.0529
0.0113          3.8669   2550   0.0517
0.0042          3.9428   2600   0.0518
0.0021          4.0186   2650   0.0539
0.0027          4.0944   2700   0.0573
0.0017          4.1702   2750   0.0590
0.0033          4.2460   2800   0.0603
0.003           4.3219   2850   0.0618
0.0013          4.3977   2900   0.0623
0.003           4.4735   2950   0.0625
0.0036          4.5493   3000   0.0631
0.0017          4.6252   3050   0.0634
0.0023          4.7010   3100   0.0635
0.0028          4.7768   3150   0.0635
0.0028          4.8526   3200   0.0637
0.0021          4.9284   3250   0.0636
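
The reported evaluation loss of 0.0360 corresponds to the step-1300 checkpoint (epoch ≈ 1.97). From epoch 2 onward, training loss keeps falling while validation loss climbs back above 0.06, a pattern consistent with overfitting, so the earlier checkpoint appears to have been retained as the final model.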

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
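
A quick way to confirm that an environment matches these pins (a convenience sketch):

```python
# Print installed versions to compare against the list above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```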