whisper-large-v3-zwksa1704v2

This model is a fine-tuned version of openai/whisper-large-v3 on a local audio dataset loaded with the audiofolder loader (the underlying dataset is not documented). It achieves the following results on the evaluation set:

  • Loss: 1.0802
  • WER: 46.6417 (%)

Model description

More information needed

Intended uses & limitations

More information needed
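
No usage guidance is given, but as a minimal sketch, the checkpoint can presumably be loaded like any Whisper fine-tune through the transformers ASR pipeline. The audio file name and device choice below are assumptions, not part of this card:

```python
# Minimal inference sketch; assumes the checkpoint is available on the Hub
# as DanaRL/whisper-large-v3-zwksa1704v2 and that sample.wav exists locally.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="DanaRL/whisper-large-v3-zwksa1704v2",
    torch_dtype=torch.float16,  # assumed; drop for CPU-only inference
    device="cuda:0",            # assumed; use "cpu" if no GPU is available
)

# chunk_length_s enables long-form transcription in 30 s windows,
# Whisper's native input length.
result = asr("sample.wav", chunk_length_s=30)
print(result["text"])
```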

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16 (train_batch_size 4 × gradient_accumulation_steps 4)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1200
  • mixed_precision_training: Native AMP
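
For reference, the list above maps onto a transformers Seq2SeqTrainingArguments object roughly as follows. This is a reconstruction, not the author's script: output_dir, the evaluation cadence, and predict_with_generate are assumptions not stated in the card, and the Adam betas and epsilon match the library defaults, so they are omitted.

```python
# Hedged reconstruction of the training configuration listed above.
# Anything marked "assumed" is not documented in the model card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-zwksa1704v2",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 4 * 4 = 16
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1200,
    fp16=True,                       # "Native AMP" mixed-precision training
    eval_strategy="steps",           # assumed; the table below evaluates every 100 steps
    eval_steps=100,                  # assumed from the results table
    predict_with_generate=True,      # assumed; required to compute WER during evaluation
)
```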

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER (%) |
|:-------------:|:-------:|:----:|:---------------:|:-------:|
| 0.5554        | 1.8868  | 100  | 0.6266          | 47.2539 |
| 0.2544        | 3.7736  | 200  | 0.6527          | 47.5939 |
| 0.0940        | 5.6604  | 300  | 0.7754          | 49.3198 |
| 0.0399        | 7.5472  | 400  | 0.8516          | 47.3984 |
| 0.0181        | 9.4340  | 500  | 0.9104          | 47.0498 |
| 0.0088        | 11.3208 | 600  | 0.9556          | 46.9138 |
| 0.0039        | 13.2075 | 700  | 1.0108          | 46.6587 |
| 0.0019        | 15.0943 | 800  | 1.0358          | 47.1263 |
| 0.0013        | 16.9811 | 900  | 1.0692          | 46.6672 |
| 0.0011        | 18.8679 | 1000 | 1.0767          | 46.5227 |
| 0.0011        | 20.7547 | 1100 | 1.0797          | 46.6927 |
| 0.0011        | 22.6415 | 1200 | 1.0802          | 46.6417 |
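
The WER column appears to be a percentage (word error rate × 100), as is conventional in these auto-generated cards. Below is a minimal sketch of how such a score can be computed with the evaluate library; the reference and prediction strings are illustrative, not drawn from the actual evaluation set:

```python
# WER computation sketch using the evaluate library.
# The strings below are made-up examples, not data from this model's eval set.
import evaluate

wer_metric = evaluate.load("wer")

references = ["the quick brown fox jumps over the lazy dog"]
predictions = ["the quick brown fox jumped over a lazy dog"]

# compute() returns a fraction; multiply by 100 to match the card's scale.
wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.4f}")
```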

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.3.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.19.1