deepakkumar07 committed on
Commit 23794e8 · verified · 1 Parent(s): 00e0bfb

Uploading whisper small model's demo app.py

Files changed (3)
  1. README.md +73 -5
  2. app.py +21 -0
  3. requirements.txt +3 -0
README.md CHANGED
@@ -1,12 +1,80 @@
  ---
- title: Whisper Small Demo
- emoji: 💻
  colorFrom: blue
- colorTo: red
  sdk: gradio
- sdk_version: 5.20.1
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Audio to Text
+ emoji: ▶︎ •၊၊||၊|။||||။‌‌‌‌‌၊|• 0:10 ➤ 📄
+
  colorFrom: blue
+ colorTo: yellow
  sdk: gradio
  app_file: app.py
  pinned: false
+ license: apache-2.0
  ---

+ # Whisper Small Model Demo
+
+ This Space demonstrates OpenAI's Whisper small model for automatic speech recognition (ASR). Users can upload an
+ audio file or record audio directly to obtain a transcription.
+
+ ## Overview
+
+ Whisper is an ASR model developed by OpenAI. This demo uses its small variant (`openai/whisper-small`) to transcribe
+ speech into text. The application is built with [Gradio](https://gradio.app/), which provides a web interface for
+ machine learning models.
+
+ ## Features
+
+ - **Audio Input**: Upload a pre-recorded audio file or record from the microphone.
+ - **Transcription**: Generate a text transcription of the input audio.
+ - **Language Support**: Whisper supports multiple languages; this demo passes no language hint, so the language is detected from the audio.
+
+ ## Usage
+
+ 1. **Select Input Method**:
+    - *Upload*: Select the upload source and choose an audio file from your device.
+    - *Record*: Select the microphone source and record audio with your microphone.
+
+ 2. **Transcription**:
+    - After providing the audio input, click the "Submit" button.
+    - The transcription appears in the output box.
+
+ ## Requirements
+
+ To run this demo locally, ensure you have the following installed:
+
+ - Python 3.8 or higher
+ - Required Python packages listed in `requirements.txt`
+
+ ## Setup Instructions
+
+ 1. **Clone the Repository**:
+
+    ```bash
+    git clone https://huggingface.co/spaces/your-username/whisper-small-demo
+    cd whisper-small-demo
+    ```
+
+ 2. **Install Dependencies**:
+
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Run the Application**:
+
+    ```bash
+    python app.py
+    ```
+
+ Access the demo locally at `http://localhost:7860`.
+
+ ## Acknowledgements
+
+ - [OpenAI](https://openai.com/) for developing the Whisper model.
+ - [Gradio](https://gradio.app/) for providing an easy-to-use interface for machine learning applications.
+ - [Hugging Face Spaces](https://huggingface.co/spaces) for hosting this demo.
+
+ ## References
+
+ - [OpenAI Whisper GitHub Repository](https://github.com/openai/whisper)
+ - [Gradio Documentation](https://gradio.app/docs/)
+ - [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
app.py ADDED
@@ -0,0 +1,21 @@
+ import torch
+ import gradio as gr
+ from transformers import pipeline
+
+ pipe = pipeline(task="automatic-speech-recognition",
+                 model="openai/whisper-small",
+                 device="cuda" if torch.cuda.is_available() else "cpu")
+
+ def transcribe(audio):
+     text = pipe(audio)["text"]
+     return text
+
+ interface = gr.Interface(
+     fn=transcribe,
+     inputs=gr.Audio(sources=["microphone", "upload"], type="filepath"),
+     outputs="text",
+     title="Whisper Small",
+     description="Realtime demo for Speech recognition using a Whisper small model.",
+ )
+ if __name__ == "__main__":
+     interface.launch()
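
Note on `transcribe`: Gradio passes `None` to the function when the form is submitted with no recording or upload, and calling the pipeline on `None` is expected to raise. The following is a minimal sketch of one way to guard against that; it is not part of this commit. `make_transcriber` and `fake_asr` are hypothetical names introduced for illustration, and `fake_asr` stands in for the real pipeline so the sketch runs without downloading a model.

```python
from typing import Callable, Optional


def make_transcriber(asr: Callable[[str], dict]) -> Callable[[Optional[str]], str]:
    """Wrap an ASR callable so that empty input returns "" instead of raising."""

    def transcribe(audio: Optional[str]) -> str:
        # Gradio passes None when nothing was recorded or uploaded.
        if audio is None:
            return ""
        # The ASR callable is assumed to return a dict with a "text" key,
        # as the transformers pipeline in app.py does.
        return asr(audio)["text"].strip()

    return transcribe


def fake_asr(path: str) -> dict:
    # Hypothetical stand-in for the real pipeline; returns a canned result.
    return {"text": f" transcript of {path}"}


transcribe = make_transcriber(fake_asr)
print(repr(transcribe(None)))
print(transcribe("clip.wav"))
```

In app.py the same wrapper could be applied to the real pipeline with `transcribe = make_transcriber(pipe)`, leaving the `gr.Interface` call unchanged.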
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ gradio
+ torch
+ transformers