m2m100_418M-ft-cy-to-en

This model is a fine-tuned version of facebook/m2m100_418M for Welsh-to-English (cy→en) translation on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8103
  • Bleu: 38.1416
  • Gen Len: 30.6116
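
How to use

The snippet below is a minimal sketch of loading this checkpoint for Welsh-to-English translation with the standard M2M100 API in Transformers; the example sentence is illustrative only.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_id = "DewiBrynJones/m2m100_418M-ft-cy-to-en"
tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

# M2M100 is multilingual: set the source language and force the
# English BOS token so generation decodes into English.
tokenizer.src_lang = "cy"
inputs = tokenizer("Mae'r tywydd yn braf heddiw.", return_tensors="pt")
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.get_lang_id("en")
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```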

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 6000
  • training_steps: 30000
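
These settings map directly onto transformers.Seq2SeqTrainingArguments. The following is a hedged sketch of a matching configuration, not the author's actual training script; output_dir and the steps-based evaluation cadence (every 2000 steps, as seen in the results table below) are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of a Seq2SeqTrainingArguments setup matching the listed
# hyperparameters; output_dir is hypothetical.
training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-ft-cy-to-en",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",        # assumed: evaluation every 2000 steps, per the table below
    eval_steps=2000,
    predict_with_generate=True,   # needed for BLEU / generation-length metrics
)
```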

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Bleu    | Gen Len |
|---------------|--------|-------|-----------------|---------|---------|
| 2.1452        | 0.0166 | 2000  | 1.8458          | 16.5803 | 33.8627 |
| 1.6103        | 0.0332 | 4000  | 1.3438          | 24.8191 | 32.3474 |
| 1.3416        | 0.0499 | 6000  | 1.1426          | 30.3046 | 31.0528 |
| 1.2254        | 0.0665 | 8000  | 1.0307          | 28.6069 | 33.3047 |
| 1.1184        | 0.0831 | 10000 | 0.9745          | 34.3431 | 30.2578 |
| 1.0831        | 0.0997 | 12000 | 0.9310          | 33.0402 | 31.6276 |
| 1.0426        | 0.1164 | 14000 | 0.8990          | 36.4181 | 29.894  |
| 1.018         | 0.1330 | 16000 | 0.8783          | 37.2119 | 29.5248 |
| 0.9957        | 0.1496 | 18000 | 0.8601          | 37.9897 | 30.2633 |
| 0.9796        | 0.1662 | 20000 | 0.8485          | 37.5805 | 31.3311 |
| 0.9687        | 0.1829 | 22000 | 0.8378          | 39.4133 | 29.8813 |
| 0.9568        | 0.1995 | 24000 | 0.8270          | 38.5896 | 30.7235 |
| 0.9377        | 0.2161 | 26000 | 0.8169          | 38.7364 | 30.7135 |
| 0.942         | 0.2327 | 28000 | 0.8125          | 37.8857 | 30.7208 |
| 0.9427        | 0.2494 | 30000 | 0.8103          | 38.1416 | 30.6116 |
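
The Bleu and Gen Len columns are the kind of metrics a Seq2SeqTrainer compute_metrics hook reports. Below is a hedged sketch of such a hook in the style of the Transformers translation examples; the exact metric code used for this model is not documented.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    # `tokenizer` is assumed to be the M2M100 tokenizer for this model.
    preds, labels = eval_preds
    # Ignored label positions are -100; restore the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-pad tokens per generated sequence.
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": result["score"], "gen_len": gen_len}
```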

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Safetensors

  • Model size: 484M params
  • Tensor type: F32
