# Llama3-OpenBioLLM-8B-PsyCourse-fold10

This model is a fine-tuned version of [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B) on the course-train-fold10 dataset. It achieves the following results on the evaluation set:

- Loss: 0.0347
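
The framework versions below include PEFT, so this repo presumably ships a LoRA-style adapter rather than full model weights. A minimal loading sketch, assuming the adapter is published as `chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold10` and that `transformers`, `peft`, and `accelerate` are installed:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "aaditya/Llama3-OpenBioLLM-8B"
adapter_id = "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold10"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Apply the fine-tuned adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Arbitrary example prompt; the expected prompt format is not documented.
prompt = "Question: What is the primary focus of cognitive-behavioral therapy?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```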

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (they are mirrored in the `TrainingArguments` sketch after this list):

- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
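
For reproduction, these settings map roughly onto `transformers.TrainingArguments` as sketched below. This is an illustration of the mapping, not the actual training script; `output_dir` is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Sketch: the hyperparameters above expressed as TrainingArguments.
# output_dir is hypothetical; the real training script is not included here.
training_args = TrainingArguments(
    output_dir="llama3-openbiollm-8b-psycourse-fold10",
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 1 * 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```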

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5221        | 0.0770 | 50   | 0.3092          |
| 0.0971        | 0.1539 | 100  | 0.0860          |
| 0.0873        | 0.2309 | 150  | 0.0607          |
| 0.0616        | 0.3078 | 200  | 0.0565          |
| 0.071         | 0.3848 | 250  | 0.0542          |
| 0.0615        | 0.4618 | 300  | 0.0497          |
| 0.0538        | 0.5387 | 350  | 0.0468          |
| 0.0532        | 0.6157 | 400  | 0.0462          |
| 0.0501        | 0.6926 | 450  | 0.0482          |
| 0.0575        | 0.7696 | 500  | 0.0422          |
| 0.0418        | 0.8466 | 550  | 0.0440          |
| 0.048         | 0.9235 | 600  | 0.0398          |
| 0.0559        | 1.0005 | 650  | 0.0397          |
| 0.0358        | 1.0774 | 700  | 0.0431          |
| 0.0277        | 1.1544 | 750  | 0.0392          |
| 0.029         | 1.2314 | 800  | 0.0376          |
| 0.0283        | 1.3083 | 850  | 0.0383          |
| 0.035         | 1.3853 | 900  | 0.0371          |
| 0.0367        | 1.4622 | 950  | 0.0373          |
| 0.0272        | 1.5392 | 1000 | 0.0428          |
| 0.0435        | 1.6162 | 1050 | 0.0367          |
| 0.0379        | 1.6931 | 1100 | 0.0368          |
| 0.0296        | 1.7701 | 1150 | 0.0378          |
| 0.0423        | 1.8470 | 1200 | 0.0377          |
| 0.0389        | 1.9240 | 1250 | 0.0347          |
| 0.0349        | 2.0010 | 1300 | 0.0378          |
| 0.0191        | 2.0779 | 1350 | 0.0376          |
| 0.0252        | 2.1549 | 1400 | 0.0371          |
| 0.016         | 2.2318 | 1450 | 0.0381          |
| 0.0211        | 2.3088 | 1500 | 0.0362          |
| 0.0223        | 2.3858 | 1550 | 0.0355          |
| 0.0227        | 2.4627 | 1600 | 0.0385          |
| 0.0268        | 2.5397 | 1650 | 0.0354          |
| 0.0267        | 2.6166 | 1700 | 0.0349          |
| 0.0158        | 2.6936 | 1750 | 0.0352          |
| 0.0186        | 2.7706 | 1800 | 0.0384          |
| 0.0155        | 2.8475 | 1850 | 0.0401          |
| 0.0158        | 2.9245 | 1900 | 0.0365          |
| 0.0185        | 3.0014 | 1950 | 0.0362          |
| 0.0103        | 3.0784 | 2000 | 0.0401          |
| 0.0111        | 3.1554 | 2050 | 0.0402          |
| 0.0105        | 3.2323 | 2100 | 0.0448          |
| 0.0077        | 3.3093 | 2150 | 0.0435          |
| 0.0078        | 3.3862 | 2200 | 0.0476          |
| 0.0072        | 3.4632 | 2250 | 0.0457          |
| 0.0118        | 3.5402 | 2300 | 0.0452          |
| 0.0107        | 3.6171 | 2350 | 0.0448          |
| 0.01          | 3.6941 | 2400 | 0.0478          |
| 0.0092        | 3.7710 | 2450 | 0.0471          |
| 0.0166        | 3.8480 | 2500 | 0.0437          |
| 0.0048        | 3.9250 | 2550 | 0.0444          |
| 0.0057        | 4.0019 | 2600 | 0.0454          |
| 0.0033        | 4.0789 | 2650 | 0.0484          |
| 0.0032        | 4.1558 | 2700 | 0.0500          |
| 0.005         | 4.2328 | 2750 | 0.0527          |
| 0.004         | 4.3098 | 2800 | 0.0546          |
| 0.0034        | 4.3867 | 2850 | 0.0554          |
| 0.0023        | 4.4637 | 2900 | 0.0560          |
| 0.0027        | 4.5406 | 2950 | 0.0564          |
| 0.0025        | 4.6176 | 3000 | 0.0563          |
| 0.0054        | 4.6946 | 3050 | 0.0568          |
| 0.0016        | 4.7715 | 3100 | 0.0569          |
| 0.0024        | 4.8485 | 3150 | 0.0567          |
| 0.0018        | 4.9254 | 3200 | 0.0568          |
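
Validation loss reaches its minimum of 0.0347 around step 1250 (epoch ≈ 1.92) and drifts upward through the later epochs, matching the evaluation loss reported at the top of this card and suggesting the best checkpoint falls late in the second epoch. A short matplotlib sketch, using a few (step, validation loss) points transcribed from the table, makes the U-shaped curve visible:

```python
import matplotlib.pyplot as plt

# A subset of (step, validation loss) points from the table above,
# enough to show the trend: the minimum sits near step 1250.
steps = [50, 250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250, 2500, 2750, 3000, 3200]
val_loss = [0.3092, 0.0542, 0.0422, 0.0392, 0.0428, 0.0347, 0.0362,
            0.0352, 0.0401, 0.0457, 0.0437, 0.0527, 0.0563, 0.0568]

plt.plot(steps, val_loss, marker="o")
plt.axvline(1250, linestyle="--", color="gray", label="best checkpoint (0.0347)")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.legend()
plt.show()
```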

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- PyTorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3
