lora-gpt2-e2e-reproduce

This model is a fine-tuned version of gpt2-medium on the e2e_nlg dataset. Validation loss and BLEU on the evaluation set are reported per checkpoint in the table of training results.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 8
eval_batch_size: 4
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
mixed_precision_training: Native AMP
label_smoothing_factor: 0.1
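
The schedule settings above (learning_rate 0.0002, linear scheduler, 500 warmup steps) can be sketched as a plain function. This is a minimal sketch rather than the training code itself: it assumes the usual shape of a linear ramp during warmup followed by a linear decay to zero, and it assumes 52,580 total optimizer steps. That total is not stated in the card; it is inferred from the logged results, where step 3000 at epoch 0.5706 implies about 5,258 steps per epoch over 10 epochs. The name `linear_lr` is hypothetical.

```python
BASE_LR = 2e-4          # learning_rate from the list above
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
# Assumption: inferred from the logged step/epoch pairs, not stated in the card.
STEPS_PER_EPOCH = 5258
TOTAL_STEPS = STEPS_PER_EPOCH * 10  # num_epochs: 10


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR during warmup.
        return BASE_LR * step / WARMUP_STEPS
    # Linear decay from BASE_LR at the end of warmup down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 250, 500, 26540, TOTAL_STEPS):
        print(step, f"{linear_lr(step):.6f}")
```

Under these assumptions the rate reaches its peak at step 500, well before the first logged evaluation at step 3000, so every row in the results table falls in the decay phase.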

Training Loss	Epoch	Step	Validation Loss	Bleu
2.9523	0.5706	3000	2.6028	0.3489
2.6924	1.1411	6000	2.5544	0.3501
2.6493	1.7117	9000	2.5217	0.4052
2.6252	2.2822	12000	2.5048	0.3894
2.6023	2.8528	15000	2.4957	0.4060
2.5962	3.4234	18000	2.4863	0.3772
2.5797	3.9939	21000	2.4812	0.3697
2.5691	4.5645	24000	2.4746	0.3864
2.5677	5.1350	27000	2.4708	0.3709
2.5530	5.7056	30000	2.4648	0.3787
2.5567	6.2762	33000	2.4610	0.3754
2.5469	6.8467	36000	2.4593	0.3670
2.5422	7.4173	39000	2.4566	0.3663
2.5376	7.9878	42000	2.4548	0.3621
2.5340	8.5584	45000	2.4538	0.3812
2.5279	9.1289	48000	2.4532	0.3695
2.5273	9.6995	51000	2.4493	0.3781
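
In the table above, validation loss decreases at every evaluation, while BLEU peaks early and then fluctuates, so the two metrics point at different checkpoints. The following is a small sketch of how one might select a checkpoint by each criterion, using the step, validation loss, and BLEU columns copied from the table. The `Row` type and the `best_by` helper are hypothetical names for illustration, not part of any library.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    step: int
    val_loss: float
    bleu: float


# (step, validation loss, BLEU) copied from the results table.
ROWS: List[Row] = [
    Row(3000, 2.6028, 0.3489),
    Row(6000, 2.5544, 0.3501),
    Row(9000, 2.5217, 0.4052),
    Row(12000, 2.5048, 0.3894),
    Row(15000, 2.4957, 0.4060),
    Row(18000, 2.4863, 0.3772),
    Row(21000, 2.4812, 0.3697),
    Row(24000, 2.4746, 0.3864),
    Row(27000, 2.4708, 0.3709),
    Row(30000, 2.4648, 0.3787),
    Row(33000, 2.4610, 0.3754),
    Row(36000, 2.4593, 0.3670),
    Row(39000, 2.4566, 0.3663),
    Row(42000, 2.4548, 0.3621),
    Row(45000, 2.4538, 0.3812),
    Row(48000, 2.4532, 0.3695),
    Row(51000, 2.4493, 0.3781),
]


def best_by(rows: List[Row], key: Callable[[Row], float],
            higher_is_better: bool) -> Row:
    """Return the row that is best under the given metric (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=key)


if __name__ == "__main__":
    print("lowest validation loss:", best_by(ROWS, lambda r: r.val_loss, False))
    print("highest BLEU:", best_by(ROWS, lambda r: r.bleu, True))
```

With these rows, the loss criterion selects the last logged checkpoint (step 51000) and the BLEU criterion selects step 15000, which may matter when deciding which checkpoint to keep for generation.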