base-lora-hf

This model is a LoRA (PEFT) adapter for google/mt5-base, fine-tuned on an unknown dataset. It achieves the following results on the evaluation set (a loading sketch is given after the list):

  • Loss: 4.9672
  • Rouge1: 5.3819
  • Rouge2: 0.5196
  • Rougel: 4.7713
  • Rougelsum: 4.7866

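Because this repository contains a PEFT/LoRA adapter rather than full model weights, it has to be loaded on top of the google/mt5-base base model. The snippet below is a minimal loading sketch, not the authors' own code; the repo id `benitoals/base-lora-hf` and the example input are assumptions taken from this card's title.

```python
# Minimal sketch: load the LoRA adapter on top of google/mt5-base.
# The repo id and the example input are assumptions, not part of this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")
model = PeftModel.from_pretrained(base, "benitoals/base-lora-hf")  # assumed Hub repo id

inputs = tokenizer("An example source document to summarize.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
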
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
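
For reference, the hyperparameters above roughly correspond to the PEFT + `Seq2SeqTrainingArguments` setup sketched below. This is a hedged reconstruction, not the original training script: the LoRA rank, alpha, dropout, and target modules are assumptions, since the card does not record them.

```python
# Hedged reconstruction of the training setup from the hyperparameters above.
# LoRA rank/alpha/dropout/target modules are assumptions; the card does not list them.
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # assumption: rank not reported in the card
    lora_alpha=32,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["q", "v"],  # assumption: typical mT5 attention projections
)
model = get_peft_model(base, lora_config)

training_args = Seq2SeqTrainingArguments(
    output_dir="base-lora-hf",
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",          # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    num_train_epochs=4,
    predict_with_generate=True,   # needed to compute ROUGE during evaluation
)
```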

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Rougelsum
19.9845 0.0251 5 20.2996 0.9748 0.2022 0.9325 0.9418
21.4138 0.0503 10 20.5804 0.9695 0.2108 0.9242 0.9322
22.1251 0.0754 15 20.5377 0.9484 0.2108 0.9059 0.9121
21.3367 0.1005 20 20.3529 0.9310 0.1873 0.8883 0.8910
21.8064 0.1256 25 20.1915 0.9299 0.1873 0.8920 0.8878
22.1628 0.1508 30 19.8701 0.9189 0.1867 0.8760 0.8783
19.1129 0.1759 35 19.4976 0.9189 0.1867 0.8760 0.8783
22.7727 0.2010 40 19.3954 0.8863 0.1693 0.8449 0.8450
19.4877 0.2261 45 19.3051 0.8695 0.1693 0.8311 0.8350
22.7237 0.2513 50 19.0373 0.9211 0.1867 0.8840 0.8873
20.0259 0.2764 55 18.8301 0.9075 0.1867 0.8753 0.8772
20.0452 0.3015 60 18.6446 0.9345 0.1867 0.8992 0.9021
19.9441 0.3266 65 18.5094 0.9493 0.1867 0.9210 0.9190
20.6445 0.3518 70 17.7888 0.9241 0.1867 0.8851 0.8835
20.2272 0.3769 75 17.7626 0.8845 0.1867 0.8603 0.8597
19.4512 0.4020 80 17.4225 0.8690 0.1693 0.8451 0.8463
18.6298 0.4271 85 17.2512 0.8690 0.1693 0.8451 0.8463
19.8135 0.4523 90 16.9832 0.8717 0.1693 0.8470 0.8476
18.1969 0.4774 95 16.5094 0.8603 0.1693 0.8375 0.8378
18.2965 0.5025 100 16.2330 0.8725 0.1693 0.8469 0.8480
17.7528 0.5276 105 15.9999 0.9351 0.2016 0.8999 0.9103
18.5468 0.5528 110 15.8200 0.9093 0.2005 0.8722 0.8864
18.5202 0.5779 115 15.2716 0.9639 0.2307 0.9213 0.9236
18.4827 0.6030 120 15.0205 1.0231 0.2307 0.9861 0.9923
18.4432 0.6281 125 14.2666 1.0696 0.2412 1.0329 1.0385
15.5714 0.6533 130 13.8502 1.0922 0.2793 1.0538 1.0591
16.6447 0.6784 135 13.3299 1.1199 0.2929 1.0721 1.0778
16.5493 0.7035 140 13.1941 1.2006 0.2819 1.1359 1.1420
16.5007 0.7286 145 13.0195 1.1336 0.2819 1.0792 1.0843
15.3121 0.7538 150 12.5816 1.0646 0.2355 1.0290 1.0348
15.9766 0.7789 155 12.2410 1.1067 0.2355 1.0669 1.0773
14.5082 0.8040 160 11.9994 1.1207 0.2355 1.0817 1.0943
14.812 0.8291 165 11.5947 1.0905 0.2344 1.0519 1.0659
15.0269 0.8543 170 11.3428 1.0979 0.2357 1.0521 1.0671
14.4106 0.8794 175 11.1049 1.1636 0.2357 1.1055 1.1181
14.1319 0.9045 180 10.7609 1.2215 0.2248 1.1438 1.1535
13.8952 0.9296 185 10.5520 1.2630 0.2248 1.1831 1.1883
12.7063 0.9548 190 10.4312 1.3229 0.2378 1.2448 1.2592
13.4554 0.9799 195 10.2339 1.3215 0.2384 1.2514 1.2582
14.6064 1.0050 200 9.9984 1.3350 0.2384 1.2627 1.2693
12.8018 1.0302 205 9.7809 1.3170 0.2350 1.2483 1.2603
12.6067 1.0553 210 9.5817 1.4603 0.2350 1.3532 1.3609
12.7822 1.0804 215 9.4338 1.5522 0.2447 1.4238 1.4276
12.596 1.1055 220 9.3207 1.5781 0.2436 1.4383 1.4388
11.9216 1.1307 225 9.1564 1.6240 0.2417 1.4953 1.4960
12.0841 1.1558 230 9.0333 1.7191 0.2437 1.5857 1.5886
11.5676 1.1809 235 8.9293 1.6431 0.2437 1.5272 1.5248
11.4117 1.2060 240 8.7542 1.6855 0.2447 1.5391 1.5390
11.1348 1.2312 245 8.5618 1.8631 0.2640 1.7019 1.7125
11.2796 1.2563 250 8.3880 1.8559 0.2379 1.7178 1.7262
10.7728 1.2814 255 8.2453 2.0466 0.2500 1.8301 1.8343
10.7193 1.3065 260 8.0746 2.0877 0.2297 1.8682 1.8809
10.3078 1.3317 265 7.9463 2.1213 0.2118 1.8659 1.8702
9.4906 1.3568 270 7.8111 2.2115 0.1872 1.9316 1.9337
9.7248 1.3819 275 7.7112 2.3184 0.1621 2.0219 2.0422
9.12 1.4070 280 7.6426 2.3553 0.1770 2.0438 2.0638
9.9925 1.4322 285 7.5105 2.2216 0.1630 1.9617 1.9819
9.2764 1.4573 290 7.4422 2.2106 0.1344 2.0391 2.0359
9.0597 1.4824 295 7.3792 2.1439 0.1053 2.0053 1.9983
8.5065 1.5075 300 7.3106 2.0833 0.1089 1.8708 1.8755
8.992 1.5327 305 7.2412 2.0063 0.1084 1.8417 1.8540
8.6069 1.5578 310 7.1944 1.9500 0.1077 1.7940 1.8023
9.0201 1.5829 315 7.1420 2.0260 0.0967 1.8499 1.8535
8.8578 1.6080 320 7.0990 2.1266 0.0860 1.9794 1.9809
8.8284 1.6332 325 7.0596 2.0635 0.0982 1.9106 1.9164
8.3734 1.6583 330 7.0340 1.9703 0.0852 1.8102 1.8107
8.6248 1.6834 335 6.9965 1.8375 0.0843 1.7519 1.7559
8.3218 1.7085 340 6.9770 1.8187 0.0720 1.7496 1.7501
8.3853 1.7337 345 6.9528 1.7610 0.0598 1.6486 1.6470
8.4904 1.7588 350 6.9318 1.6401 0.0486 1.5289 1.5242
8.2582 1.7839 355 6.9122 1.6295 0.0493 1.5232 1.5156
8.5878 1.8090 360 6.8823 1.6154 0.0258 1.5108 1.5050
7.7663 1.8342 365 6.8499 1.5827 0.0258 1.5094 1.5009
8.2 1.8593 370 6.8237 1.6436 0.0258 1.5411 1.5388
7.5589 1.8844 375 6.8067 1.6364 0.0258 1.5458 1.5432
8.1803 1.9095 380 6.7845 1.5375 0.0124 1.4493 1.4461
7.7374 1.9347 385 6.7645 1.5121 0.0124 1.4347 1.4303
7.8814 1.9598 390 6.7452 1.5152 0.0124 1.4376 1.4363
7.6648 1.9849 395 6.7257 1.5162 0.0124 1.4428 1.4339
7.6132 2.0101 400 6.7029 1.5577 0.0124 1.4781 1.4777
7.864 2.0352 405 6.6846 1.6281 0.0124 1.5464 1.5479
7.5629 2.0603 410 6.6705 1.6544 0.0124 1.5745 1.5703
7.6238 2.0854 415 6.6598 1.6077 0.0124 1.5295 1.5295
7.4898 2.1106 420 6.6395 1.6862 0.0124 1.6088 1.6062
7.8069 2.1357 425 6.6117 1.7010 0.0124 1.6381 1.6358
7.4597 2.1608 430 6.5884 1.7218 0.0124 1.6599 1.6582
7.3898 2.1859 435 6.5677 1.7601 0.0124 1.7094 1.7086
7.3911 2.2111 440 6.5463 1.8674 0.0124 1.8057 1.7979
7.1733 2.2362 445 6.5197 1.8953 0.0124 1.8350 1.8341
7.1887 2.2613 450 6.4986 1.9404 0.0124 1.8864 1.8842
7.0774 2.2864 455 6.4825 2.0663 0.0124 2.0050 2.0030
7.2954 2.3116 460 6.4684 2.1557 0.0124 2.0807 2.0742
7.0925 2.3367 465 6.4529 2.2422 0.0124 2.1712 2.1695
7.1943 2.3618 470 6.4348 2.3632 0.0124 2.2752 2.2679
7.0861 2.3869 475 6.4134 2.4368 0.0124 2.3484 2.3430
7.18 2.4121 480 6.3854 2.5631 0.0124 2.4796 2.4720
7.2415 2.4372 485 6.3554 2.6147 0.0124 2.5171 2.5113
7.0627 2.4623 490 6.3225 2.7857 0.0124 2.6762 2.6708
7.2473 2.4874 495 6.2878 2.8660 0.0 2.7604 2.7589
7.1432 2.5126 500 6.2584 3.0089 0.0 2.8942 2.9031
6.8786 2.5377 505 6.2204 3.0677 0.0 2.9803 2.9827
7.2605 2.5628 510 6.1884 3.1637 0.0 3.0661 3.0713
6.8372 2.5879 515 6.1583 3.2234 0.0 3.1330 3.1292
6.9582 2.6131 520 6.1281 3.4104 0.0 3.2578 3.2674
6.834 2.6382 525 6.0949 3.5180 0.0137 3.3648 3.3729
7.065 2.6633 530 6.0598 3.5609 0.0137 3.4115 3.4148
6.8041 2.6884 535 6.0301 3.6212 0.0263 3.4610 3.4601
7.0177 2.7136 540 6.0045 3.6462 0.0263 3.4834 3.4829
6.7565 2.7387 545 5.9789 3.6683 0.0263 3.5033 3.5079
6.6908 2.7638 550 5.9469 3.6961 0.0263 3.5384 3.5331
6.6701 2.7889 555 5.9122 3.8271 0.0398 3.6780 3.6719
6.703 2.8141 560 5.8798 3.9789 0.0399 3.8203 3.8054
6.7734 2.8392 565 5.8518 4.0644 0.0533 3.9002 3.9022
6.8523 2.8643 570 5.8196 4.0991 0.0653 3.9677 3.9648
6.477 2.8894 575 5.7907 4.1666 0.0914 4.0340 4.0302
6.8284 2.9146 580 5.7659 4.1567 0.0914 4.0088 4.0067
6.8174 2.9397 585 5.7360 4.2228 0.0914 4.0340 4.0372
6.7112 2.9648 590 5.7117 4.2814 0.0915 4.0661 4.0639
6.5568 2.9899 595 5.6804 4.3052 0.1413 4.0651 4.0724
6.4933 3.0151 600 5.6472 4.3145 0.1273 4.0867 4.0873
6.3 3.0402 605 5.5958 4.2657 0.1510 4.0182 4.0272
6.5645 3.0653 610 5.5545 4.2810 0.1695 4.0035 3.9989
6.5585 3.0905 615 5.5188 4.3714 0.1699 4.1262 4.1184
6.5803 3.1156 620 5.4954 4.3159 0.1574 4.0346 4.0352
6.3667 3.1407 625 5.4669 4.4380 0.1829 4.1416 4.1304
6.4128 3.1658 630 5.4418 4.4830 0.1828 4.1453 4.1436
6.2 3.1910 635 5.4183 4.5121 0.1953 4.1746 4.1750
6.2363 3.2161 640 5.3853 4.6840 0.2092 4.3241 4.3157
6.3394 3.2412 645 5.3572 4.6549 0.1969 4.3112 4.3093
6.3711 3.2663 650 5.3348 4.8189 0.2395 4.4461 4.4309
6.3356 3.2915 655 5.3154 4.7549 0.2282 4.3740 4.3551
6.2359 3.3166 660 5.2990 4.7922 0.2556 4.3994 4.3842
6.1759 3.3417 665 5.2830 4.9007 0.2765 4.4718 4.4659
6.5212 3.3668 670 5.2652 4.8672 0.2772 4.4498 4.4419
6.3572 3.3920 675 5.2523 4.8740 0.2921 4.4736 4.4664
6.142 3.4171 680 5.2297 4.8456 0.3048 4.4642 4.4639
6.2698 3.4422 685 5.2075 4.9942 0.3382 4.5893 4.5796
6.1898 3.4673 690 5.1786 5.0596 0.3377 4.6012 4.6013
6.1935 3.4925 695 5.1579 5.1421 0.3352 4.6368 4.6311
6.0207 3.5176 700 5.1373 5.1597 0.3198 4.6450 4.6314
6.2054 3.5427 705 5.1142 5.3065 0.3589 4.7477 4.7568
6.0647 3.5678 710 5.0953 5.3270 0.4192 4.7751 4.7847
6.2467 3.5930 715 5.0779 5.3485 0.4558 4.7980 4.8109
6.2021 3.6181 720 5.0642 5.3178 0.4558 4.7720 4.7920
6.1525 3.6432 725 5.0512 5.2626 0.4670 4.7375 4.7532
5.9856 3.6683 730 5.0375 5.3317 0.4597 4.7651 4.7658
6.3415 3.6935 735 5.0281 5.2917 0.4597 4.7487 4.7482
5.9981 3.7186 740 5.0204 5.2751 0.4597 4.7114 4.7088
6.1648 3.7437 745 5.0091 5.2866 0.4597 4.7184 4.7199
6.1724 3.7688 750 5.0011 5.3533 0.4597 4.7817 4.7842
6.0231 3.7940 755 4.9939 5.3533 0.4597 4.7817 4.7842
5.9608 3.8191 760 4.9870 5.4313 0.4952 4.8477 4.8522
6.1358 3.8442 765 4.9817 5.4523 0.5043 4.8543 4.8664
6.061 3.8693 770 4.9765 5.5196 0.5196 4.8860 4.8956
6.2685 3.8945 775 4.9726 5.4864 0.5196 4.8490 4.8594
6.0714 3.9196 780 4.9702 5.4314 0.5196 4.8078 4.8230
6.1423 3.9447 785 4.9686 5.3819 0.5196 4.7713 4.7866
5.931 3.9698 790 4.9676 5.4160 0.5196 4.8078 4.8230
6.148 3.9950 795 4.9672 5.3819 0.5196 4.7713 4.7866
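
The ROUGE columns above are percentage-scale scores logged at each evaluation step. A typical `compute_metrics` function for producing such numbers with the `evaluate` library is sketched below; it illustrates how these metrics are usually computed for a Seq2SeqTrainer and is not the authors' exact code.

```python
# Sketch of a compute_metrics function yielding Rouge1/Rouge2/RougeL/RougeLsum
# scores like those in the table; illustrative only, not the authors' exact code.
import evaluate
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    scores = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # evaluate returns fractions in [0, 1]; the table reports them scaled by 100.
    return {k: round(v * 100, 4) for k, v in scores.items()}
```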

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0