base-lora-hf
This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 4.9672
- Rouge1: 5.3819
- Rouge2: 0.5196
- Rougel: 4.7713
- Rougelsum: 4.7866
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 4
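The hyperparameters above could be expressed as arguments for the Transformers trainer roughly as follows. This is a minimal sketch, not the author's actual training script: it assumes a single device (so `train_batch_size` maps to `per_device_train_batch_size`), assumes `Seq2SeqTrainingArguments` was the argument class, uses a hypothetical `output_dir`, and infers the 5-step evaluation and logging cadence from the results table rather than from any stated setting.

```python
# Values copied from the hyperparameter list; anything marked "assumption"
# or "inferred" is not stated in the card.
TRAINING_KWARGS = {
    "output_dir": "base-lora-hf",        # assumption: hypothetical output path
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,    # assumption: single device
    "per_device_eval_batch_size": 4,     # assumption: single device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "eval_strategy": "steps",            # inferred: results are reported every 5 steps
    "eval_steps": 5,                     # inferred from the results table
    "logging_steps": 5,                  # inferred from the results table
}


def build_training_args():
    """Build the arguments object; transformers is imported lazily."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**TRAINING_KWARGS)
```

Calling `build_training_args()` requires `transformers` to be installed; the dictionary itself can be inspected or adjusted without it.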
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
---|---|---|---|---|---|---|---|
19.9845 | 0.0251 | 5 | 20.2996 | 0.9748 | 0.2022 | 0.9325 | 0.9418 |
21.4138 | 0.0503 | 10 | 20.5804 | 0.9695 | 0.2108 | 0.9242 | 0.9322 |
22.1251 | 0.0754 | 15 | 20.5377 | 0.9484 | 0.2108 | 0.9059 | 0.9121 |
21.3367 | 0.1005 | 20 | 20.3529 | 0.9310 | 0.1873 | 0.8883 | 0.8910 |
21.8064 | 0.1256 | 25 | 20.1915 | 0.9299 | 0.1873 | 0.8920 | 0.8878 |
22.1628 | 0.1508 | 30 | 19.8701 | 0.9189 | 0.1867 | 0.8760 | 0.8783 |
19.1129 | 0.1759 | 35 | 19.4976 | 0.9189 | 0.1867 | 0.8760 | 0.8783 |
22.7727 | 0.2010 | 40 | 19.3954 | 0.8863 | 0.1693 | 0.8449 | 0.8450 |
19.4877 | 0.2261 | 45 | 19.3051 | 0.8695 | 0.1693 | 0.8311 | 0.8350 |
22.7237 | 0.2513 | 50 | 19.0373 | 0.9211 | 0.1867 | 0.8840 | 0.8873 |
20.0259 | 0.2764 | 55 | 18.8301 | 0.9075 | 0.1867 | 0.8753 | 0.8772 |
20.0452 | 0.3015 | 60 | 18.6446 | 0.9345 | 0.1867 | 0.8992 | 0.9021 |
19.9441 | 0.3266 | 65 | 18.5094 | 0.9493 | 0.1867 | 0.9210 | 0.9190 |
20.6445 | 0.3518 | 70 | 17.7888 | 0.9241 | 0.1867 | 0.8851 | 0.8835 |
20.2272 | 0.3769 | 75 | 17.7626 | 0.8845 | 0.1867 | 0.8603 | 0.8597 |
19.4512 | 0.4020 | 80 | 17.4225 | 0.8690 | 0.1693 | 0.8451 | 0.8463 |
18.6298 | 0.4271 | 85 | 17.2512 | 0.8690 | 0.1693 | 0.8451 | 0.8463 |
19.8135 | 0.4523 | 90 | 16.9832 | 0.8717 | 0.1693 | 0.8470 | 0.8476 |
18.1969 | 0.4774 | 95 | 16.5094 | 0.8603 | 0.1693 | 0.8375 | 0.8378 |
18.2965 | 0.5025 | 100 | 16.2330 | 0.8725 | 0.1693 | 0.8469 | 0.8480 |
17.7528 | 0.5276 | 105 | 15.9999 | 0.9351 | 0.2016 | 0.8999 | 0.9103 |
18.5468 | 0.5528 | 110 | 15.8200 | 0.9093 | 0.2005 | 0.8722 | 0.8864 |
18.5202 | 0.5779 | 115 | 15.2716 | 0.9639 | 0.2307 | 0.9213 | 0.9236 |
18.4827 | 0.6030 | 120 | 15.0205 | 1.0231 | 0.2307 | 0.9861 | 0.9923 |
18.4432 | 0.6281 | 125 | 14.2666 | 1.0696 | 0.2412 | 1.0329 | 1.0385 |
15.5714 | 0.6533 | 130 | 13.8502 | 1.0922 | 0.2793 | 1.0538 | 1.0591 |
16.6447 | 0.6784 | 135 | 13.3299 | 1.1199 | 0.2929 | 1.0721 | 1.0778 |
16.5493 | 0.7035 | 140 | 13.1941 | 1.2006 | 0.2819 | 1.1359 | 1.1420 |
16.5007 | 0.7286 | 145 | 13.0195 | 1.1336 | 0.2819 | 1.0792 | 1.0843 |
15.3121 | 0.7538 | 150 | 12.5816 | 1.0646 | 0.2355 | 1.0290 | 1.0348 |
15.9766 | 0.7789 | 155 | 12.2410 | 1.1067 | 0.2355 | 1.0669 | 1.0773 |
14.5082 | 0.8040 | 160 | 11.9994 | 1.1207 | 0.2355 | 1.0817 | 1.0943 |
14.812 | 0.8291 | 165 | 11.5947 | 1.0905 | 0.2344 | 1.0519 | 1.0659 |
15.0269 | 0.8543 | 170 | 11.3428 | 1.0979 | 0.2357 | 1.0521 | 1.0671 |
14.4106 | 0.8794 | 175 | 11.1049 | 1.1636 | 0.2357 | 1.1055 | 1.1181 |
14.1319 | 0.9045 | 180 | 10.7609 | 1.2215 | 0.2248 | 1.1438 | 1.1535 |
13.8952 | 0.9296 | 185 | 10.5520 | 1.2630 | 0.2248 | 1.1831 | 1.1883 |
12.7063 | 0.9548 | 190 | 10.4312 | 1.3229 | 0.2378 | 1.2448 | 1.2592 |
13.4554 | 0.9799 | 195 | 10.2339 | 1.3215 | 0.2384 | 1.2514 | 1.2582 |
14.6064 | 1.0050 | 200 | 9.9984 | 1.3350 | 0.2384 | 1.2627 | 1.2693 |
12.8018 | 1.0302 | 205 | 9.7809 | 1.3170 | 0.2350 | 1.2483 | 1.2603 |
12.6067 | 1.0553 | 210 | 9.5817 | 1.4603 | 0.2350 | 1.3532 | 1.3609 |
12.7822 | 1.0804 | 215 | 9.4338 | 1.5522 | 0.2447 | 1.4238 | 1.4276 |
12.596 | 1.1055 | 220 | 9.3207 | 1.5781 | 0.2436 | 1.4383 | 1.4388 |
11.9216 | 1.1307 | 225 | 9.1564 | 1.6240 | 0.2417 | 1.4953 | 1.4960 |
12.0841 | 1.1558 | 230 | 9.0333 | 1.7191 | 0.2437 | 1.5857 | 1.5886 |
11.5676 | 1.1809 | 235 | 8.9293 | 1.6431 | 0.2437 | 1.5272 | 1.5248 |
11.4117 | 1.2060 | 240 | 8.7542 | 1.6855 | 0.2447 | 1.5391 | 1.5390 |
11.1348 | 1.2312 | 245 | 8.5618 | 1.8631 | 0.2640 | 1.7019 | 1.7125 |
11.2796 | 1.2563 | 250 | 8.3880 | 1.8559 | 0.2379 | 1.7178 | 1.7262 |
10.7728 | 1.2814 | 255 | 8.2453 | 2.0466 | 0.2500 | 1.8301 | 1.8343 |
10.7193 | 1.3065 | 260 | 8.0746 | 2.0877 | 0.2297 | 1.8682 | 1.8809 |
10.3078 | 1.3317 | 265 | 7.9463 | 2.1213 | 0.2118 | 1.8659 | 1.8702 |
9.4906 | 1.3568 | 270 | 7.8111 | 2.2115 | 0.1872 | 1.9316 | 1.9337 |
9.7248 | 1.3819 | 275 | 7.7112 | 2.3184 | 0.1621 | 2.0219 | 2.0422 |
9.12 | 1.4070 | 280 | 7.6426 | 2.3553 | 0.1770 | 2.0438 | 2.0638 |
9.9925 | 1.4322 | 285 | 7.5105 | 2.2216 | 0.1630 | 1.9617 | 1.9819 |
9.2764 | 1.4573 | 290 | 7.4422 | 2.2106 | 0.1344 | 2.0391 | 2.0359 |
9.0597 | 1.4824 | 295 | 7.3792 | 2.1439 | 0.1053 | 2.0053 | 1.9983 |
8.5065 | 1.5075 | 300 | 7.3106 | 2.0833 | 0.1089 | 1.8708 | 1.8755 |
8.992 | 1.5327 | 305 | 7.2412 | 2.0063 | 0.1084 | 1.8417 | 1.8540 |
8.6069 | 1.5578 | 310 | 7.1944 | 1.9500 | 0.1077 | 1.7940 | 1.8023 |
9.0201 | 1.5829 | 315 | 7.1420 | 2.0260 | 0.0967 | 1.8499 | 1.8535 |
8.8578 | 1.6080 | 320 | 7.0990 | 2.1266 | 0.0860 | 1.9794 | 1.9809 |
8.8284 | 1.6332 | 325 | 7.0596 | 2.0635 | 0.0982 | 1.9106 | 1.9164 |
8.3734 | 1.6583 | 330 | 7.0340 | 1.9703 | 0.0852 | 1.8102 | 1.8107 |
8.6248 | 1.6834 | 335 | 6.9965 | 1.8375 | 0.0843 | 1.7519 | 1.7559 |
8.3218 | 1.7085 | 340 | 6.9770 | 1.8187 | 0.0720 | 1.7496 | 1.7501 |
8.3853 | 1.7337 | 345 | 6.9528 | 1.7610 | 0.0598 | 1.6486 | 1.6470 |
8.4904 | 1.7588 | 350 | 6.9318 | 1.6401 | 0.0486 | 1.5289 | 1.5242 |
8.2582 | 1.7839 | 355 | 6.9122 | 1.6295 | 0.0493 | 1.5232 | 1.5156 |
8.5878 | 1.8090 | 360 | 6.8823 | 1.6154 | 0.0258 | 1.5108 | 1.5050 |
7.7663 | 1.8342 | 365 | 6.8499 | 1.5827 | 0.0258 | 1.5094 | 1.5009 |
8.2 | 1.8593 | 370 | 6.8237 | 1.6436 | 0.0258 | 1.5411 | 1.5388 |
7.5589 | 1.8844 | 375 | 6.8067 | 1.6364 | 0.0258 | 1.5458 | 1.5432 |
8.1803 | 1.9095 | 380 | 6.7845 | 1.5375 | 0.0124 | 1.4493 | 1.4461 |
7.7374 | 1.9347 | 385 | 6.7645 | 1.5121 | 0.0124 | 1.4347 | 1.4303 |
7.8814 | 1.9598 | 390 | 6.7452 | 1.5152 | 0.0124 | 1.4376 | 1.4363 |
7.6648 | 1.9849 | 395 | 6.7257 | 1.5162 | 0.0124 | 1.4428 | 1.4339 |
7.6132 | 2.0101 | 400 | 6.7029 | 1.5577 | 0.0124 | 1.4781 | 1.4777 |
7.864 | 2.0352 | 405 | 6.6846 | 1.6281 | 0.0124 | 1.5464 | 1.5479 |
7.5629 | 2.0603 | 410 | 6.6705 | 1.6544 | 0.0124 | 1.5745 | 1.5703 |
7.6238 | 2.0854 | 415 | 6.6598 | 1.6077 | 0.0124 | 1.5295 | 1.5295 |
7.4898 | 2.1106 | 420 | 6.6395 | 1.6862 | 0.0124 | 1.6088 | 1.6062 |
7.8069 | 2.1357 | 425 | 6.6117 | 1.7010 | 0.0124 | 1.6381 | 1.6358 |
7.4597 | 2.1608 | 430 | 6.5884 | 1.7218 | 0.0124 | 1.6599 | 1.6582 |
7.3898 | 2.1859 | 435 | 6.5677 | 1.7601 | 0.0124 | 1.7094 | 1.7086 |
7.3911 | 2.2111 | 440 | 6.5463 | 1.8674 | 0.0124 | 1.8057 | 1.7979 |
7.1733 | 2.2362 | 445 | 6.5197 | 1.8953 | 0.0124 | 1.8350 | 1.8341 |
7.1887 | 2.2613 | 450 | 6.4986 | 1.9404 | 0.0124 | 1.8864 | 1.8842 |
7.0774 | 2.2864 | 455 | 6.4825 | 2.0663 | 0.0124 | 2.0050 | 2.0030 |
7.2954 | 2.3116 | 460 | 6.4684 | 2.1557 | 0.0124 | 2.0807 | 2.0742 |
7.0925 | 2.3367 | 465 | 6.4529 | 2.2422 | 0.0124 | 2.1712 | 2.1695 |
7.1943 | 2.3618 | 470 | 6.4348 | 2.3632 | 0.0124 | 2.2752 | 2.2679 |
7.0861 | 2.3869 | 475 | 6.4134 | 2.4368 | 0.0124 | 2.3484 | 2.3430 |
7.18 | 2.4121 | 480 | 6.3854 | 2.5631 | 0.0124 | 2.4796 | 2.4720 |
7.2415 | 2.4372 | 485 | 6.3554 | 2.6147 | 0.0124 | 2.5171 | 2.5113 |
7.0627 | 2.4623 | 490 | 6.3225 | 2.7857 | 0.0124 | 2.6762 | 2.6708 |
7.2473 | 2.4874 | 495 | 6.2878 | 2.8660 | 0.0 | 2.7604 | 2.7589 |
7.1432 | 2.5126 | 500 | 6.2584 | 3.0089 | 0.0 | 2.8942 | 2.9031 |
6.8786 | 2.5377 | 505 | 6.2204 | 3.0677 | 0.0 | 2.9803 | 2.9827 |
7.2605 | 2.5628 | 510 | 6.1884 | 3.1637 | 0.0 | 3.0661 | 3.0713 |
6.8372 | 2.5879 | 515 | 6.1583 | 3.2234 | 0.0 | 3.1330 | 3.1292 |
6.9582 | 2.6131 | 520 | 6.1281 | 3.4104 | 0.0 | 3.2578 | 3.2674 |
6.834 | 2.6382 | 525 | 6.0949 | 3.5180 | 0.0137 | 3.3648 | 3.3729 |
7.065 | 2.6633 | 530 | 6.0598 | 3.5609 | 0.0137 | 3.4115 | 3.4148 |
6.8041 | 2.6884 | 535 | 6.0301 | 3.6212 | 0.0263 | 3.4610 | 3.4601 |
7.0177 | 2.7136 | 540 | 6.0045 | 3.6462 | 0.0263 | 3.4834 | 3.4829 |
6.7565 | 2.7387 | 545 | 5.9789 | 3.6683 | 0.0263 | 3.5033 | 3.5079 |
6.6908 | 2.7638 | 550 | 5.9469 | 3.6961 | 0.0263 | 3.5384 | 3.5331 |
6.6701 | 2.7889 | 555 | 5.9122 | 3.8271 | 0.0398 | 3.6780 | 3.6719 |
6.703 | 2.8141 | 560 | 5.8798 | 3.9789 | 0.0399 | 3.8203 | 3.8054 |
6.7734 | 2.8392 | 565 | 5.8518 | 4.0644 | 0.0533 | 3.9002 | 3.9022 |
6.8523 | 2.8643 | 570 | 5.8196 | 4.0991 | 0.0653 | 3.9677 | 3.9648 |
6.477 | 2.8894 | 575 | 5.7907 | 4.1666 | 0.0914 | 4.0340 | 4.0302 |
6.8284 | 2.9146 | 580 | 5.7659 | 4.1567 | 0.0914 | 4.0088 | 4.0067 |
6.8174 | 2.9397 | 585 | 5.7360 | 4.2228 | 0.0914 | 4.0340 | 4.0372 |
6.7112 | 2.9648 | 590 | 5.7117 | 4.2814 | 0.0915 | 4.0661 | 4.0639 |
6.5568 | 2.9899 | 595 | 5.6804 | 4.3052 | 0.1413 | 4.0651 | 4.0724 |
6.4933 | 3.0151 | 600 | 5.6472 | 4.3145 | 0.1273 | 4.0867 | 4.0873 |
6.3 | 3.0402 | 605 | 5.5958 | 4.2657 | 0.1510 | 4.0182 | 4.0272 |
6.5645 | 3.0653 | 610 | 5.5545 | 4.2810 | 0.1695 | 4.0035 | 3.9989 |
6.5585 | 3.0905 | 615 | 5.5188 | 4.3714 | 0.1699 | 4.1262 | 4.1184 |
6.5803 | 3.1156 | 620 | 5.4954 | 4.3159 | 0.1574 | 4.0346 | 4.0352 |
6.3667 | 3.1407 | 625 | 5.4669 | 4.4380 | 0.1829 | 4.1416 | 4.1304 |
6.4128 | 3.1658 | 630 | 5.4418 | 4.4830 | 0.1828 | 4.1453 | 4.1436 |
6.2 | 3.1910 | 635 | 5.4183 | 4.5121 | 0.1953 | 4.1746 | 4.1750 |
6.2363 | 3.2161 | 640 | 5.3853 | 4.6840 | 0.2092 | 4.3241 | 4.3157 |
6.3394 | 3.2412 | 645 | 5.3572 | 4.6549 | 0.1969 | 4.3112 | 4.3093 |
6.3711 | 3.2663 | 650 | 5.3348 | 4.8189 | 0.2395 | 4.4461 | 4.4309 |
6.3356 | 3.2915 | 655 | 5.3154 | 4.7549 | 0.2282 | 4.3740 | 4.3551 |
6.2359 | 3.3166 | 660 | 5.2990 | 4.7922 | 0.2556 | 4.3994 | 4.3842 |
6.1759 | 3.3417 | 665 | 5.2830 | 4.9007 | 0.2765 | 4.4718 | 4.4659 |
6.5212 | 3.3668 | 670 | 5.2652 | 4.8672 | 0.2772 | 4.4498 | 4.4419 |
6.3572 | 3.3920 | 675 | 5.2523 | 4.8740 | 0.2921 | 4.4736 | 4.4664 |
6.142 | 3.4171 | 680 | 5.2297 | 4.8456 | 0.3048 | 4.4642 | 4.4639 |
6.2698 | 3.4422 | 685 | 5.2075 | 4.9942 | 0.3382 | 4.5893 | 4.5796 |
6.1898 | 3.4673 | 690 | 5.1786 | 5.0596 | 0.3377 | 4.6012 | 4.6013 |
6.1935 | 3.4925 | 695 | 5.1579 | 5.1421 | 0.3352 | 4.6368 | 4.6311 |
6.0207 | 3.5176 | 700 | 5.1373 | 5.1597 | 0.3198 | 4.6450 | 4.6314 |
6.2054 | 3.5427 | 705 | 5.1142 | 5.3065 | 0.3589 | 4.7477 | 4.7568 |
6.0647 | 3.5678 | 710 | 5.0953 | 5.3270 | 0.4192 | 4.7751 | 4.7847 |
6.2467 | 3.5930 | 715 | 5.0779 | 5.3485 | 0.4558 | 4.7980 | 4.8109 |
6.2021 | 3.6181 | 720 | 5.0642 | 5.3178 | 0.4558 | 4.7720 | 4.7920 |
6.1525 | 3.6432 | 725 | 5.0512 | 5.2626 | 0.4670 | 4.7375 | 4.7532 |
5.9856 | 3.6683 | 730 | 5.0375 | 5.3317 | 0.4597 | 4.7651 | 4.7658 |
6.3415 | 3.6935 | 735 | 5.0281 | 5.2917 | 0.4597 | 4.7487 | 4.7482 |
5.9981 | 3.7186 | 740 | 5.0204 | 5.2751 | 0.4597 | 4.7114 | 4.7088 |
6.1648 | 3.7437 | 745 | 5.0091 | 5.2866 | 0.4597 | 4.7184 | 4.7199 |
6.1724 | 3.7688 | 750 | 5.0011 | 5.3533 | 0.4597 | 4.7817 | 4.7842 |
6.0231 | 3.7940 | 755 | 4.9939 | 5.3533 | 0.4597 | 4.7817 | 4.7842 |
5.9608 | 3.8191 | 760 | 4.9870 | 5.4313 | 0.4952 | 4.8477 | 4.8522 |
6.1358 | 3.8442 | 765 | 4.9817 | 5.4523 | 0.5043 | 4.8543 | 4.8664 |
6.061 | 3.8693 | 770 | 4.9765 | 5.5196 | 0.5196 | 4.8860 | 4.8956 |
6.2685 | 3.8945 | 775 | 4.9726 | 5.4864 | 0.5196 | 4.8490 | 4.8594 |
6.0714 | 3.9196 | 780 | 4.9702 | 5.4314 | 0.5196 | 4.8078 | 4.8230 |
6.1423 | 3.9447 | 785 | 4.9686 | 5.3819 | 0.5196 | 4.7713 | 4.7866 |
5.931 | 3.9698 | 790 | 4.9676 | 5.4160 | 0.5196 | 4.8078 | 4.8230 |
6.148 | 3.9950 | 795 | 4.9672 | 5.3819 | 0.5196 | 4.7713 | 4.7866 |
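Two things can be read off the table with a little arithmetic, sketched below on a handful of rows copied from it. First, the step-to-epoch ratio implies about 199 optimizer steps per epoch, which at a batch size of 4 would mean at most roughly 796 training examples (assuming a single device and no gradient accumulation, neither of which the card states). Second, the highest Rouge1 appears at step 770 rather than at the final step, while validation loss is lowest at the final step. The helper names here are hypothetical.

```python
# (step, epoch, validation_loss, rouge1) rows copied from the table above.
ROWS = [
    (5, 0.0251, 20.2996, 0.9748),
    (200, 1.0050, 9.9984, 1.3350),
    (400, 2.0101, 6.7029, 1.5577),
    (600, 3.0151, 5.6472, 4.3145),
    (770, 3.8693, 4.9765, 5.5196),
    (785, 3.9447, 4.9686, 5.3819),
    (795, 3.9950, 4.9672, 5.3819),
]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    step, epoch = rows[-1][0], rows[-1][1]
    return round(step / epoch)


def best_by(rows, index, highest=True):
    """Return the row with the highest (or lowest) value in the given column."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[index])


print(steps_per_epoch(ROWS))                # 199
print(best_by(ROWS, 3)[0])                  # 770 (highest Rouge1)
print(best_by(ROWS, 2, highest=False)[0])   # 795 (lowest validation loss)
```

If checkpoints were saved along the way, selecting by ROUGE instead of loss would favor a slightly earlier step; whether such checkpoints exist is not stated in the card.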
Framework versions
- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0
Model tree for benitoals/base-lora-hf
Base model
google/mt5-base
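Since the card lists PEFT among its framework versions and names google/mt5-base as the base model, the repository presumably holds a LoRA adapter rather than full model weights. Under that assumption, loading it might look like the sketch below. The function names `load_model` and `generate` are hypothetical, the card does not document an input format or task prefix, and the first call downloads the weights.

```python
BASE_MODEL_ID = "google/mt5-base"
ADAPTER_ID = "benitoals/base-lora-hf"   # repository name from the model tree


def load_model(adapter_id=ADAPTER_ID, base_id=BASE_MODEL_ID):
    """Load the base model and attach the adapter (assumed to be a PEFT adapter)."""
    # Imported lazily so this module can be imported without the libraries installed.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForSeq2SeqLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def generate(text, tokenizer, model, max_new_tokens=64):
    """Run greedy generation on a single input string."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_model()` followed by `generate("some input text", tokenizer, model)`. Given the evaluation scores reported in this card (Rouge1 of about 5.4), the generated text should be expected to be rough.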