Llama3-OpenBioLLM-8B-PsyCourse-fold7

This model is a fine-tuned version of aaditya/Llama3-OpenBioLLM-8B on the course-train-fold7 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0343
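
Since the Framework versions below list PEFT, this checkpoint is a LoRA adapter rather than a full model. A minimal usage sketch for loading it on top of the base model, assuming the adapter is published as chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold7 (the repository id of this card) and that standard PEFT/Transformers loading applies; the prompt is only an illustrative placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "aaditya/Llama3-OpenBioLLM-8B"
adapter_id = "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold7"  # this adapter's repository id

# Load the base model and tokenizer, then attach the LoRA adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Example prompt (placeholder); replace with your own input.
prompt = "Explain the difference between negative reinforcement and punishment."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```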

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
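
A sketch of how these settings map onto Hugging Face TrainingArguments, assuming the run used the standard Trainer API (the card does not state the training framework, and LoRA-specific settings such as rank and target modules are not listed); output_dir is a placeholder, while the 50-step evaluation cadence matches the results table below:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama3-openbiollm-8b-psycourse-fold7",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective train batch size 1 x 16 = 16
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="steps",
    eval_steps=50,  # evaluation every 50 steps, as in the results table
)
```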

Training results

Training Loss Epoch Step Validation Loss
0.5143 0.0764 50 0.3069
0.1115 0.1528 100 0.0800
0.0781 0.2292 150 0.0636
0.0677 0.3056 200 0.0595
0.0609 0.3820 250 0.0500
0.0442 0.4584 300 0.0519
0.0605 0.5348 350 0.0530
0.0415 0.6112 400 0.0427
0.0616 0.6875 450 0.0430
0.0412 0.7639 500 0.0389
0.0306 0.8403 550 0.0385
0.0579 0.9167 600 0.0378
0.0625 0.9931 650 0.0424
0.0363 1.0695 700 0.0368
0.024 1.1459 750 0.0377
0.0331 1.2223 800 0.0374
0.0368 1.2987 850 0.0383
0.039 1.3751 900 0.0359
0.0428 1.4515 950 0.0387
0.0285 1.5279 1000 0.0350
0.0292 1.6043 1050 0.0368
0.0317 1.6807 1100 0.0365
0.0506 1.7571 1150 0.0349
0.0329 1.8335 1200 0.0353
0.0352 1.9099 1250 0.0377
0.0289 1.9862 1300 0.0365
0.0202 2.0626 1350 0.0356
0.0174 2.1390 1400 0.0357
0.0134 2.2154 1450 0.0395
0.02 2.2918 1500 0.0361
0.0189 2.3682 1550 0.0374
0.0162 2.4446 1600 0.0348
0.0252 2.5210 1650 0.0371
0.0175 2.5974 1700 0.0366
0.0222 2.6738 1750 0.0346
0.0274 2.7502 1800 0.0347
0.0215 2.8266 1850 0.0362
0.0201 2.9030 1900 0.0378
0.016 2.9794 1950 0.0343
0.009 3.0558 2000 0.0372
0.0106 3.1322 2050 0.0389
0.0061 3.2086 2100 0.0432
0.0075 3.2850 2150 0.0434
0.0089 3.3613 2200 0.0434
0.0102 3.4377 2250 0.0462
0.0083 3.5141 2300 0.0465
0.0131 3.5905 2350 0.0443
0.0054 3.6669 2400 0.0424
0.0038 3.7433 2450 0.0428
0.0074 3.8197 2500 0.0429
0.0056 3.8961 2550 0.0426
0.007 3.9725 2600 0.0428
0.0034 4.0489 2650 0.0434
0.0051 4.1253 2700 0.0460
0.0043 4.2017 2750 0.0464
0.0023 4.2781 2800 0.0472
0.0035 4.3545 2850 0.0477
0.0021 4.4309 2900 0.0488
0.0021 4.5073 2950 0.0497
0.0024 4.5837 3000 0.0501
0.0013 4.6600 3050 0.0505
0.0031 4.7364 3100 0.0511
0.0031 4.8128 3150 0.0511
0.0023 4.8892 3200 0.0512
0.0023 4.9656 3250 0.0511

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
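
A quick way to check that a local environment matches the pinned versions above (a convenience sketch, not part of the original card):

```python
from importlib.metadata import version

expected = {
    "peft": "0.12.0",
    "transformers": "4.46.1",
    "torch": "2.5.1+cu124",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}
for pkg, want in expected.items():
    # Compare the installed version against the version listed on this card.
    print(f"{pkg}: installed {version(pkg)}, card lists {want}")
```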