asahi417 committed · Commit ce9c566 · verified · 1 Parent(s): dec784f

Update README.md

  1. README.md +32 -39
README.md CHANGED
@@ -8,19 +8,8 @@ tags:
  metrics:
  - wer
  - cer
- widget:
- - example_title: CommonVoice 8.0 (Test Split)
-   src: >-
-     https://huggingface.co/datasets/japanese-asr/ja_asr.common_voice_8_0/resolve/main/sample.flac
- - example_title: JSUT Basic 5000
-   src: >-
-     https://huggingface.co/datasets/japanese-asr/ja_asr.jsut_basic5000/resolve/main/sample.flac
- - example_title: ReazonSpeech (Test Split)
-   src: >-
-     https://huggingface.co/datasets/japanese-asr/ja_asr.reazonspeech_test/resolve/main/sample.flac
- pipeline_tag: automatic-speech-recognition
  model-index:
- - name: kotoba-tech/kotoba-whisper-v2.1
+ - name: kotoba-tech/kotoba-whisper-v2.0
    results:
    - task:
        type: automatic-speech-recognition
@@ -28,36 +17,40 @@ model-index:
        name: CommonVoice_8.0 (Japanese)
        type: japanese-asr/ja_asr.common_voice_8_0
      metrics:
-     - type: WER
-       value: 59.27
-       name: WER
-     - type: CER
-       value: 9.44
-       name: CER
+     - name: WER
+       type: WER
+       value: 58.9
+     - name: CER
+       type: CER
+       value: 9.2
    - task:
        type: automatic-speech-recognition
      dataset:
        name: ReazonSpeech (Test)
        type: japanese-asr/ja_asr.reazonspeech_test
      metrics:
-     - type: WER
-       value: 56.62
-       name: WER
-     - type: CER
-       value: 12.6
-       name: CER
+     - name: WER
+       type: WER
+       value: 55.6
+     - name: CER
+       type: CER
+       value: 11.63
    - task:
        type: automatic-speech-recognition
      dataset:
        name: JSUT Basic5000
        type: japanese-asr/ja_asr.jsut_basic5000
      metrics:
-     - type: WER
-       value: 64.36
-       name: WER
-     - type: CER
-       value: 8.48
-       name: CER
+     - name: WER
+       type: WER
+       value: 63.8
+     - name: CER
+       type: CER
+       value: 8.4
+ datasets:
+ - japanese-asr/whisper_transcriptions.reazonspeech.all
+ - japanese-asr/whisper_transcriptions.reazonspeech.all.wer_10.0
+ - japanese-asr/whisper_transcriptions.reazonspeech.all.wer_10.0.vectorized
  ---

  # Kotoba-Whisper-v2.1
@@ -74,15 +67,15 @@ along with the.

  | model | CommonVoice 8.0 (Japanese) | JSUT Basic 5000 | ReazonSpeech Test |
  |:---------------------------------------------------------|---------------------------------------:|-------------------------------------:|----------------------------------------:|
- | kotoba-tech/kotoba-whisper-v2.0 | 15.6 | 15.2 | 17.8 |
- | kotoba-tech/kotoba-whisper-v2.1 (punctuator + stable-ts) | 13.7 | ***11.2*** | ***17.4*** |
- | kotoba-tech/kotoba-whisper-v2.1 (punctuator) | 13.9 | 11.4 | 18 |
- | kotoba-tech/kotoba-whisper-v2.1 (stable-ts) | 15.7 | 15 | 17.7 |
- | kotoba-tech/kotoba-whisper-v1.0 | 15.6 | 15.2 | 17.8 |
- | kotoba-tech/kotoba-whisper-v1.1 (punctuator + stable-ts) | 13.7 | ***11.2*** | ***17.4*** |
- | kotoba-tech/kotoba-whisper-v1.1 (punctuator) | 13.9 | 11.4 | 18 |
- | kotoba-tech/kotoba-whisper-v1.1 (stable-ts) | 15.7 | 15 | 17.7 |
- | openai/whisper-large-v3 | ***12.9*** | 13.4 | 20.6 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator + stable-ts) | 13.7 | 11.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (punctuator) | 13.8 | 11.6 | 17.3 |
+ | [kotoba-tech/kotoba-whisper-v2.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.1) (stable-ts) | 15.5 | 15.4 | 17 |
+ | [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0) | 15.4 | 15.4 | 17.4 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator + stable-ts) | 13.7 | 11.2 | 17.4 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (punctuator) | 13.9 | 11.4 | 18 |
+ | [kotoba-tech/kotoba-whisper-v1.1](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.1) (stable-ts) | 15.7 | 15 | 17.7 |
+ | [kotoba-tech/kotoba-whisper-v1.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v1.0) | 15.6 | 15.2 | 17.8 |
+ | [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) | 12.9 | 13.4 | 20.6 |

  Regarding to the normalized CER, since those update from v2.1 will be removed by the normalization, kotoba-tech/kotoba-whisper-v2.1 marks the same CER values as [kotoba-tech/kotoba-whisper-v2.0](https://huggingface.co/kotoba-tech/kotoba-whisper-v2.0).

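
The README's remark on normalized CER (the v2.1 changes are removed by normalization, so v2.1 and v2.0 score the same) can be illustrated with a small sketch. It is not the evaluation code behind the table. The `normalize` function is a hypothetical normalizer that only drops punctuation and whitespace, and the sample sentences are made up for illustration; the normalizer actually used for the reported scores is not shown in this diff.

```python
import unicodedata


def normalize(text: str) -> str:
    """Hypothetical normalizer: drop whitespace and Unicode punctuation."""
    return "".join(
        ch for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str, normalized: bool = False) -> float:
    """Character error rate: edit distance divided by reference length."""
    if normalized:
        ref, hyp = normalize(ref), normalize(hyp)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)


# Made-up example: the same transcript with and without punctuation.
reference = "今日は、いい天気です。"
without_punctuation = "今日はいい天気です"
with_punctuation = "今日は、いい天気です。"

# Raw CER differs: the unpunctuated output misses 2 of 11 reference characters.
print(cer(reference, without_punctuation))
print(cer(reference, with_punctuation))

# Normalized CER is identical for both, because punctuation is stripped first.
print(cer(reference, without_punctuation, normalized=True))
print(cer(reference, with_punctuation, normalized=True))
```

Under this assumption, adding punctuation changes only the raw score, which would be consistent with the punctuator rows differing in the table above while the normalized CER stays the same across v2.0 and v2.1.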