mt5-small-finetune-finetuned-research-papers-XX

This model is a fine-tuned version of google/mt5-small; the fine-tuning dataset is not specified on this card. It achieves the following results on the evaluation set:

  • Loss: 2.5998
  • Rouge1: 36.5831
  • Rouge2: 17.8222
  • RougeL: 32.0591
  • RougeLsum: 32.1426
  • Gen Len: 16.0415
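
To try the model, here is a minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as Mug3n24/mt5-small-finetune-finetuned-research-papers-XX; the generation settings are illustrative, not the ones used to produce the metrics above:

```python
# Minimal summarization sketch (assumed Hub id; generation settings illustrative).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Mug3n24/mt5-small-finetune-finetuned-research-papers-XX"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Replace this with the abstract or body of a research paper."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The reported Gen Len of roughly 16 tokens suggests the model produces short summaries, so a modest max_new_tokens is reasonable.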

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5.6e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4
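
For reference, a sketch of how these settings map onto Seq2SeqTrainingArguments in Transformers 4.51; output_dir and predict_with_generate are assumptions, while the remaining values mirror the list above:

```python
# The hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir and predict_with_generate are assumptions, not taken from this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetune-finetuned-research-papers-XX",  # assumed
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",         # AdamW; betas=(0.9, 0.999) and eps=1e-08 are the defaults
    lr_scheduler_type="linear",
    num_train_epochs=4,
    predict_with_generate=True,  # assumed: needed to compute ROUGE during evaluation
)
```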

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len
6.6546        | 0.499 |  499 | 3.0145          | 27.7249 | 12.6467 | 24.9735 | 24.9728   | 12.4275
3.7935        | 0.998 |  998 | 2.7804          | 35.8013 | 17.0498 | 31.3295 | 31.3783   | 15.984
3.4614        | 1.497 | 1497 | 2.6818          | 35.7213 | 17.1243 | 31.4956 | 31.552    | 15.3485
3.2954        | 1.996 | 1996 | 2.6486          | 35.5961 | 17.2535 | 31.3263 | 31.411    | 15.857
3.1932        | 2.495 | 2495 | 2.6300          | 36.5296 | 17.8491 | 32.1923 | 32.2628   | 15.8925
3.1508        | 2.994 | 2994 | 2.6121          | 36.4565 | 17.6813 | 32.013  | 32.0796   | 16.0425
3.0782        | 3.493 | 3493 | 2.6094          | 36.4064 | 17.7208 | 31.9757 | 32.0421   | 16.0315
3.1005        | 3.992 | 3992 | 2.5998          | 36.5831 | 17.8222 | 32.0591 | 32.1426   | 16.0415
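
ROUGE scores like those in the table are commonly computed with the evaluate library; a minimal sketch follows (the prediction and reference strings are placeholders, and the table above appears to report the resulting F-measures scaled by 100):

```python
# Sketch of a ROUGE computation with the `evaluate` library.
# The prediction/reference strings are placeholders.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model generated summary"]
references = ["the reference summary written by a human"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum (F-measures in [0, 1])
```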

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1

Model size: 300M parameters (F32, Safetensors)