Spaces: deepakkumar07/whisper-small-demo

deepakkumar07 committed on Mar 12

Commit 23794e8 · verified · 1 parent: 00e0bfb

Uploading whisper small model's demo app.py

Files changed (3)

README.md +73 -5
app.py +21 -0
requirements.txt +3 -0

README.md CHANGED

@@ -1,12 +1,80 @@
 ---
-title: Whisper Small Demo
-emoji: 💻
 colorFrom: blue
-colorTo: red
 sdk: gradio
-sdk_version: 5.20.1
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Audio to Text
+emoji: ▶︎ •၊၊||၊|။||||။‌‌‌‌‌၊|• 0:10 ➤ 📄
 colorFrom: blue
+colorTo: yellow
 sdk: gradio
 app_file: app.py
 pinned: false
+license: apache-2.0
 ---
+# Whisper Small Model Demo
+This Space demonstrates the capabilities of OpenAI's Whisper small model for automatic speech recognition (ASR). Users
+can upload audio files or record audio directly to obtain transcriptions.
+## Overview
+Whisper is an ASR model developed by OpenAI. This demo uses the small variant (`openai/whisper-small`) to
+transcribe spoken language into text. The application is built with [Gradio](https://gradio.app/), which provides a
+web interface for machine learning models.
+## Features
+- **Audio Input**: Upload pre-recorded audio files or record audio in real-time.
+- **Transcription**: Generate text transcriptions of the input audio.
+- **Language Support**: Whisper supports multiple languages; `app.py` does not set a language, so language detection is left to the model.
+## Usage
+1. **Select Input Method**:
+    - *Upload*: Click on the "Upload" button to select an audio file from your device.
+    - *Record*: Use the "Record" button to capture audio using your microphone.
+2. **Transcription**:
+    - After providing the audio input, click the "Submit" button (the default `gr.Interface` button label).
+    - The transcription will appear in the output box below.
+## Requirements
+To run this demo locally, ensure you have the following installed:
+- Python 3.8 or higher
+- Required Python packages listed in `requirements.txt`
+## Setup Instructions
+1. **Clone the Repository**:
+   ```bash
+   git clone https://huggingface.co/spaces/your-username/whisper-small-demo
+   cd whisper-small-demo
+   ```
+2. **Install Dependencies**:
+   ```bash
+   pip install -r requirements.txt
+   ```
+3. **Run the Application**:
+   ```bash
+   python app.py
+   ```
+   Access the demo locally at `http://localhost:7860`.
+## Acknowledgements
+- [OpenAI](https://openai.com/) for developing the Whisper model.
+- [Gradio](https://gradio.app/) for providing an easy-to-use interface for machine learning applications.
+- [Hugging Face Spaces](https://huggingface.co/spaces) for hosting this demo.
+## References
+- [OpenAI Whisper GitHub Repository](https://github.com/openai/whisper)
+- [Gradio Documentation](https://gradio.app/docs/)
+- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)
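
Note on language support: `app.py` calls the pipeline without a language setting. If transcription should be pinned to one language, the `transformers` ASR pipeline accepts a `generate_kwargs` dictionary at call time. The helper below is a minimal sketch; the name `whisper_generate_kwargs` is hypothetical and not part of this commit.

```python
from typing import Dict, Optional


def whisper_generate_kwargs(language: Optional[str] = None) -> Dict[str, str]:
    """Build generation options for a Whisper pipeline call.

    Hypothetical helper: with no language given, only the task is set and
    language detection is left to the model.
    """
    kwargs = {"task": "transcribe"}
    if language:
        kwargs["language"] = language
    return kwargs


if __name__ == "__main__":
    print(whisper_generate_kwargs("english"))
    # In app.py this could be used as (assumption, not in the commit):
    #     text = pipe(audio, generate_kwargs=whisper_generate_kwargs("english"))["text"]
```

Keeping the options in one function makes it easy to expose the language later as a Gradio dropdown without touching the pipeline setup.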

app.py ADDED

	@@ -0,0 +1,21 @@

+import torch
+import gradio as gr
+from transformers import pipeline
+pipe = pipeline(task="automatic-speech-recognition",
+                model="openai/whisper-small",
+                device="cuda" if torch.cuda.is_available() else "cpu")
+def transcribe(audio):
+    text = pipe(audio)["text"]
+    return text
+interface = gr.Interface(
+    fn=transcribe,
+    inputs=gr.Audio(sources=["microphone", "upload"], type="filepath"),
+    outputs="text",
+    title="Whisper Small",
+    description="Real-time speech recognition demo using the Whisper small model.",
+)
+if __name__ == "__main__":
+    interface.launch()
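
Note on empty input: as far as I can tell, Gradio passes `None` to `transcribe` when the form is submitted without any audio, and `pipe(None)` would then raise. A minimal sketch of a guard follows. It wraps any ASR callable so it can be tried without downloading the model; `make_transcriber` and the stand-in `fake_asr` are hypothetical names, not part of this commit.

```python
from typing import Callable, Optional


def make_transcriber(asr: Callable[[str], dict]) -> Callable[[Optional[str]], str]:
    """Wrap an ASR callable so that missing input yields an empty string."""

    def transcribe(audio_path: Optional[str]) -> str:
        # No recording or upload: return early instead of calling the model.
        if not audio_path:
            return ""
        # The pipeline result is a dict with the transcript under "text".
        return asr(audio_path)["text"].strip()

    return transcribe


if __name__ == "__main__":
    # Stand-in for the real pipeline, used only to exercise the wrapper.
    def fake_asr(path: str) -> dict:
        return {"text": f" transcript of {path} "}

    transcribe = make_transcriber(fake_asr)
    print(repr(transcribe(None)))
    print(repr(transcribe("clip.wav")))
```

In `app.py`, `make_transcriber(pipe)` could be passed as `fn=` in place of the current `transcribe`.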

requirements.txt ADDED

	@@ -0,0 +1,3 @@

+gradio
+torch
+transformers
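
Note on local setup: the three packages above are unpinned, and the `transformers` ASR pipeline decodes audio given as a file path by calling `ffmpeg`, which `pip` does not install. The following sketch checks a local environment before running `python app.py`; the function names are hypothetical.

```python
import importlib.util
import shutil
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_available() -> bool:
    """Report whether an ffmpeg executable is on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    missing = missing_packages(["gradio", "torch", "transformers"])
    print("missing packages:", missing or "none")
    print("ffmpeg on PATH:", ffmpeg_available())
```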