metadata

title: Audio to Text
emoji: ▶︎ •၊၊||၊|။||||။‌‌‌‌‌၊|• 0:10 ➤ 📄
colorFrom: blue
colorTo: yellow
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0

Whisper Small Model Demo

This Space runs OpenAI's Whisper small model for automatic speech recognition (ASR). Upload an audio file or record from your microphone to get a transcription.

Overview

Whisper is an ASR model developed by OpenAI. This demo uses the small variant of Whisper to transcribe speech into text. The application is built with Gradio, which provides a web interface for machine learning models.
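
As a rough illustration of how the model side might look, the sketch below wraps a speech recognizer in a small helper. It assumes the model is loaded through the Hugging Face `transformers` pipeline with the `openai/whisper-small` checkpoint; the actual `app.py` may load Whisper differently, and the names `load_asr` and `transcribe` are hypothetical.

```python
from typing import Callable, Optional

# Assumed checkpoint name; this README does not confirm which one app.py uses.
MODEL_ID = "openai/whisper-small"


def load_asr() -> Callable:
    """Build a speech-recognition pipeline.

    Assumes `transformers` and a backend such as PyTorch are installed.
    """
    # Imported lazily so the helper below can be exercised without the model.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=MODEL_ID)


def transcribe(audio_path: Optional[str], asr: Callable) -> str:
    """Return the transcription of `audio_path`, or "" if no audio was given."""
    if not audio_path:
        return ""
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()
```

With the dependencies installed, `transcribe("sample.wav", load_asr())` would return the recognized text, where `sample.wav` is a placeholder file name. Passing a file path to the pipeline typically relies on ffmpeg being available to decode the audio.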

Features

Audio Input: Upload pre-recorded audio files or record audio from your microphone.
Transcription: Generate text transcriptions of the input audio.
Language Support: Whisper supports multiple languages; however, this demo is optimized for English.
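
One way a demo could pin Whisper to English is to pass decoding options at call time. The sketch below assumes the model runs through a `transformers` ASR pipeline, which forwards `generate_kwargs` to Whisper's generation step; whether this Space does so is not stated here, and the helper name `whisper_generate_kwargs` is hypothetical.

```python
from typing import Dict

# Whisper supports two tasks: same-language transcription, or translation into English.
VALID_TASKS = {"transcribe", "translate"}


def whisper_generate_kwargs(language: str = "english", task: str = "transcribe") -> Dict[str, str]:
    """Build decoding options that fix the language instead of auto-detecting it."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    return {"language": language, "task": task}


# Hypothetical usage with an already-built transformers ASR pipeline named `asr`:
#   asr("sample.wav", generate_kwargs=whisper_generate_kwargs())
```

Fixing the language skips Whisper's automatic language detection, which can help on short or noisy clips where detection is less reliable.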

Usage

Select Input Method:
- Upload: Click on the "Upload" button to select an audio file from your device.
- Record: Use the "Record" button to capture audio using your microphone.
Transcription:
- After providing the audio input, click on the "Transcribe" button.
- The transcription will appear in the output box below.
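
The controls described above could be wired together roughly as follows. This is a minimal sketch, assuming Gradio 4 or newer (where `gr.Audio` takes a `sources` list); the function names and widget labels are illustrative, and `placeholder_transcribe` stands in for the real Whisper call.

```python
from typing import Callable, Optional


def build_demo(transcribe_fn: Callable[[Optional[str]], str]):
    """Wire an audio input, a "Transcribe" button, and an output box together."""
    import gradio as gr  # imported lazily; assumes Gradio 4 or newer

    with gr.Blocks(title="Audio to Text") as demo:
        # One component covers both input methods: file upload and microphone.
        audio = gr.Audio(sources=["upload", "microphone"], type="filepath", label="Audio")
        button = gr.Button("Transcribe")
        output = gr.Textbox(label="Transcription")
        # On click, pass the audio file path to the callback and show its result.
        button.click(fn=transcribe_fn, inputs=audio, outputs=output)
    return demo


def placeholder_transcribe(audio_path: Optional[str]) -> str:
    """Stand-in callback; a real app would run Whisper on `audio_path` here."""
    if not audio_path:
        return "No audio provided."
    return f"(transcription of {audio_path} would appear here)"
```

Calling `build_demo(placeholder_transcribe).launch()` would start the interface. With `type="filepath"`, Gradio hands the callback a path to a temporary audio file, or `None` if nothing was uploaded or recorded.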

Requirements

To run this demo locally, ensure you have the following installed:

Python 3.8 or higher
Required Python packages listed in requirements.txt
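
A quick way to check these requirements before launching is sketched below. The package names in `ASSUMED_PACKAGES` are guesses at what `requirements.txt` might contain, not a copy of it, and both helper names are hypothetical.

```python
import importlib.util
import sys
from typing import Iterable, List, Tuple

# Guessed import names; consult requirements.txt for the real list.
ASSUMED_PACKAGES = ["gradio", "transformers", "torch"]


def python_ok(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Return True if the running interpreter is at least `minimum` (major, minor)."""
    return sys.version_info[:2] >= minimum


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_packages(ASSUMED_PACKAGES)` returns an empty list when every assumed package is importable, and otherwise lists the ones still to install.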

Setup Instructions

Clone the Repository:

```
git clone https://huggingface.co/spaces/your-username/whisper-small-demo
cd whisper-small-demo
```

Install Dependencies:
```
pip install -r requirements.txt
```
Run the Application:
```
python app.py
```
Access the demo locally at http://localhost:7860.

Acknowledgements

OpenAI for developing the Whisper model.
Gradio for providing an easy-to-use interface for machine learning applications.
Hugging Face Spaces for hosting this demo.
