asr-demo / README.md
metadata
title: Real-time Speech-to-Text
emoji: 🎙️
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false

Real-time Speech-to-Text with NeMo

This application transcribes speech to text in real time, using NVIDIA NeMo and the parakeet-tdt-0.6b-v2 model.
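
As a rough illustration of how the model might be loaded and used, here is a minimal sketch. The helper names `load_model`, `result_text`, and `transcribe_file` are hypothetical, the Hugging Face model id `nvidia/parakeet-tdt-0.6b-v2` is assumed, and the actual `app.py` may be organized differently.

```python
from functools import lru_cache

# Assumed Hugging Face model id for the parakeet-tdt-0.6b-v2 checkpoint.
MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"


@lru_cache(maxsize=1)
def load_model():
    """Load the ASR model once and reuse it for every request."""
    # Imported lazily so this module can be imported without NeMo installed.
    import nemo.collections.asr as nemo_asr

    model = nemo_asr.models.ASRModel.from_pretrained(model_name=MODEL_NAME)
    model.eval()
    return model


def result_text(result) -> str:
    """Extract plain text from one transcription result.

    Recent NeMo releases return objects with a `.text` attribute;
    older releases return plain strings, so both cases are handled.
    """
    return result.text if hasattr(result, "text") else str(result)


def transcribe_file(path: str) -> str:
    """Transcribe a single audio file (16 kHz mono WAV is assumed)."""
    results = load_model().transcribe([path])
    return result_text(results[0])
```

Caching the model with `lru_cache` avoids reloading the checkpoint on every call, which matters for a model of this size.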

Features

  • 🎙️ Web-based microphone input
  • ⚡ Real-time transcription displayed in the browser
  • 🧠 Fast inference with a pre-trained NeMo model
  • 🛠️ Easy to use, no installation required

Tech Stack

  • Python
  • Gradio
  • NVIDIA NeMo Toolkit for ASR

How to Use

  1. Click the microphone button to start recording
  2. Speak clearly into your microphone
  3. The transcription will appear in real time
  4. Click 'Clear Transcript' to start a new transcription
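
The steps above could be wired up in Gradio roughly as follows. This is a sketch under assumptions, not the app's actual code: `append_text`, `clear_transcript`, and `build_demo` are hypothetical names, and `transcribe_chunk` stands in for whatever function turns an audio chunk into text.

```python
def append_text(transcript: str, new_text: str) -> str:
    """Append newly recognized text to the running transcript."""
    new_text = new_text.strip()
    if not new_text:
        return transcript
    return f"{transcript} {new_text}".strip()


def clear_transcript():
    """Reset both the stored transcript and the visible textbox."""
    return "", ""


def build_demo(transcribe_chunk):
    """Build the UI; `transcribe_chunk(sample_rate, samples) -> str` is assumed."""
    import gradio as gr  # imported lazily so the helpers work without Gradio

    with gr.Blocks() as demo:
        transcript_state = gr.State("")
        mic = gr.Audio(sources=["microphone"], streaming=True, type="numpy")
        output = gr.Textbox(label="Transcript", lines=8)
        clear = gr.Button("Clear Transcript")

        def on_chunk(chunk, transcript):
            # Gradio passes streamed audio as a (sample_rate, samples) tuple.
            if chunk is None:
                return transcript, transcript
            sample_rate, samples = chunk
            text = transcribe_chunk(sample_rate, samples)
            transcript = append_text(transcript, text)
            return transcript, transcript

        mic.stream(on_chunk, inputs=[mic, transcript_state],
                   outputs=[transcript_state, output])
        clear.click(clear_transcript, outputs=[transcript_state, output])
    return demo
```

Keeping the transcript in a `gr.State` value gives each browser session its own text, so clearing one session does not affect another.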

Note

This application requires access to your microphone to function. The audio is processed in real time and is not stored.
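
One way to process audio without storing it is to keep only a short, bounded window of samples in memory. The sketch below is illustrative only: `RollingAudioBuffer` is a hypothetical class, and the 16 kHz sample rate and 10-second window are assumptions rather than values taken from the app.

```python
from collections import deque


class RollingAudioBuffer:
    """Hold only the most recent `max_seconds` of audio samples in memory."""

    def __init__(self, sample_rate: int = 16000, max_seconds: float = 10.0):
        self.sample_rate = sample_rate
        # A deque with maxlen silently drops the oldest samples when full.
        self._samples = deque(maxlen=int(sample_rate * max_seconds))

    def extend(self, samples) -> None:
        """Add new samples; anything older than the window is discarded."""
        self._samples.extend(samples)

    def snapshot(self) -> list:
        """Return a copy of the current window for transcription."""
        return list(self._samples)

    def clear(self) -> None:
        """Drop all buffered audio, for example when the transcript is cleared."""
        self._samples.clear()

    @property
    def seconds(self) -> float:
        """Duration of the buffered audio in seconds."""
        return len(self._samples) / self.sample_rate
```

Nothing in this sketch touches disk, and the buffer is emptied as soon as `clear()` is called or the object goes out of scope.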