Whisper Small Optimized for Stuttered Speech

This model is a fine-tuned version of openai/whisper-small on the TimeStamped dataset. Evaluation results for each checkpoint are listed in the table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08, no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
training_steps: 8000
mixed_precision_training: Native AMP
label_smoothing_factor: 0.1
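
To make the schedule settings above concrete, here is a minimal sketch of how the learning rate would evolve over the 8000 training steps. It assumes a linear warmup followed by a single half-cosine decay to zero, a warmup length equal to the ratio times the total step count (rounded up), and a single training device. The function name `lr_at` is hypothetical and is not taken from the training code.

```python
import math

# Values copied from the hyperparameter list above.
PEAK_LR = 8e-06
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
TRAINING_STEPS = 8000
WARMUP_RATIO = 0.1

# Assumption: one device, so the effective batch is
# per-step batch size times gradient accumulation steps.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: warmup length is the ratio applied to the total steps, rounded up.
warmup_steps = math.ceil(TRAINING_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Linear warmup from 0 to PEAK_LR, then half-cosine decay back to 0.
    """
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (TRAINING_STEPS - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 400, 800, 4400, 8000):
        print(f"step {step:>4}: lr = {lr_at(step):.2e}")
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 32, the rate peaks at 8e-06 after 800 warmup steps, and it is back near zero by step 8000.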

Training Loss	Epoch	Step	Validation Loss	WER	WER Ortho	CER
1.5552	5.8187	500	1.7178	23.1142	12.5690	12.5620
1.4449	11.6316	1000	1.7658	14.3774	9.1288	9.1103
1.4477	17.4444	1500	1.8778	18.3517	13.5472	13.5171
1.4132	23.2573	2000	1.8607	13.6005	7.7101	7.6846
1.4065	29.0702	2500	1.8845	14.4112	8.2271	8.1993
1.4182	34.8889	3000	1.9307	14.4112	7.9953	7.9675
1.4177	40.7018	3500	1.9481	17.6649	10.8814	10.8535
1.4036	46.5146	4000	1.9508	15.2105	8.5331	8.5076
1.4012	52.3275	4500	1.9831	15.4695	8.7324	8.7069
1.4005	58.1404	5000	2.0116	15.6046	8.8252	8.7973
1.4143	63.9591	5500	2.0306	15.6609	8.9318	8.9040
1.4141	69.7719	6000	2.0445	15.7172	8.9573	8.9295
1.4140	75.5848	6500	2.0525	15.8410	9.0083	8.9805
1.3998	81.3977	7000	2.0598	15.8523	9.0361	9.0060
1.3997	87.2105	7500	2.0625	15.8635	9.0454	9.0153
1.3997	93.0234	8000	2.0628	15.8523	9.0384	9.0083
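
The table can be scanned programmatically to see which checkpoint scored best on each metric. The sketch below copies the evaluation columns from the rows above into a small list and picks the minimum of each. The `Checkpoint` type and variable names are hypothetical, and nothing here indicates which checkpoint was actually published.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    val_loss: float
    wer: float
    wer_ortho: float
    cer: float


# (step, validation loss, WER, WER Ortho, CER), copied from the table above.
CHECKPOINTS = [
    Checkpoint(500, 1.7178, 23.1142, 12.5690, 12.5620),
    Checkpoint(1000, 1.7658, 14.3774, 9.1288, 9.1103),
    Checkpoint(1500, 1.8778, 18.3517, 13.5472, 13.5171),
    Checkpoint(2000, 1.8607, 13.6005, 7.7101, 7.6846),
    Checkpoint(2500, 1.8845, 14.4112, 8.2271, 8.1993),
    Checkpoint(3000, 1.9307, 14.4112, 7.9953, 7.9675),
    Checkpoint(3500, 1.9481, 17.6649, 10.8814, 10.8535),
    Checkpoint(4000, 1.9508, 15.2105, 8.5331, 8.5076),
    Checkpoint(4500, 1.9831, 15.4695, 8.7324, 8.7069),
    Checkpoint(5000, 2.0116, 15.6046, 8.8252, 8.7973),
    Checkpoint(5500, 2.0306, 15.6609, 8.9318, 8.9040),
    Checkpoint(6000, 2.0445, 15.7172, 8.9573, 8.9295),
    Checkpoint(6500, 2.0525, 15.8410, 9.0083, 8.9805),
    Checkpoint(7000, 2.0598, 15.8523, 9.0361, 9.0060),
    Checkpoint(7500, 2.0625, 15.8635, 9.0454, 9.0153),
    Checkpoint(8000, 2.0628, 15.8523, 9.0384, 9.0083),
]

# Lower is better for every metric in the table.
best_by_wer = min(CHECKPOINTS, key=lambda c: c.wer)
best_by_cer = min(CHECKPOINTS, key=lambda c: c.cer)
best_by_loss = min(CHECKPOINTS, key=lambda c: c.val_loss)
final = CHECKPOINTS[-1]

if __name__ == "__main__":
    print("lowest WER:", best_by_wer.step, best_by_wer.wer)
    print("lowest CER:", best_by_cer.step, best_by_cer.cer)
    print("lowest validation loss:", best_by_loss.step, best_by_loss.val_loss)
    print("final minus best WER:", round(final.wer - best_by_wer.wer, 4))
```

On these numbers the lowest word and character error rates both fall at step 2000, while validation loss is lowest at step 500 and the final checkpoint at step 8000 has a higher word error rate than the step-2000 one.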