---
title: Real-time Speech-to-Text
emoji: 🎙️
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.29.0
app_file: app.py
pinned: false
---

# Real-time Speech-to-Text with NeMo

This application transcribes speech to text in real time, powered by NVIDIA NeMo and the `parakeet-tdt-0.6b-v2` model.

## Features

- 🎙️ Web-based microphone input
- ⚡ Real-time transcription displayed in the browser
- 🧠 Fast inference with a pre-trained NeMo model
- 🛠️ Easy to use, no installation required

## Tech Stack

- Python
- Gradio
- NVIDIA NeMo Toolkit for ASR
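
As a rough illustration of how the model could be loaded with the NeMo toolkit, here is a minimal sketch. The helper names `load_asr_model` and `transcribe_file` are hypothetical, and the Hugging Face ID `nvidia/parakeet-tdt-0.6b-v2` is an assumption based on the model name above; the actual `app.py` may be organized differently.

```python
# Assumed Hugging Face ID for the model named in this README.
MODEL_NAME = "nvidia/parakeet-tdt-0.6b-v2"


def load_asr_model(model_name: str = MODEL_NAME):
    """Load a pre-trained NeMo ASR model (hypothetical helper)."""
    # Imported lazily so this module can be imported without NeMo installed.
    import nemo.collections.asr as nemo_asr

    return nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)


def transcribe_file(asr_model, wav_path: str) -> str:
    """Transcribe a single audio file and return its text (hypothetical helper)."""
    results = asr_model.transcribe([wav_path])
    first = results[0]
    # Depending on the NeMo version, results may be plain strings
    # or hypothesis objects that expose a `.text` attribute.
    return first if isinstance(first, str) else first.text
```

Loading is kept separate from transcription so the model can be created once at startup and reused for every request.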

## How to Use

1. Click the microphone button to start recording
2. Speak clearly into your microphone
3. The transcription will appear in real time
4. Click 'Clear Transcript' to start a new transcription
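
The steps above imply a small piece of state: a running transcript that grows with each audio chunk and resets on 'Clear Transcript'. The sketch below shows one way that state might be handled. The function names and the injected `transcribe` callable are hypothetical and are not taken from `app.py`; in a Gradio app, functions like these would typically be wired to the audio stream and the clear button.

```python
from typing import Callable


def append_chunk(transcript: str, chunk: object,
                 transcribe: Callable[[object], str]) -> str:
    """Transcribe one audio chunk and append its text to the transcript."""
    text = transcribe(chunk).strip()
    if not text:
        # Silence or an empty result leaves the transcript unchanged.
        return transcript
    return f"{transcript} {text}".strip()


def clear_transcript() -> str:
    """Return the empty transcript used to start a new session."""
    return ""
```

Passing `transcribe` in as an argument keeps the state handling independent of the model, so it can be exercised with a stub before a real ASR model is attached.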

## Note

This application requires access to your microphone to function. The audio is processed in real time and is not stored.
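
One way to process audio without storing it is to hold only a short, bounded window of samples in memory and discard the rest. The sketch below illustrates that idea; the class name `RollingAudioBuffer`, the 16 kHz default sample rate, and the 5-second window are assumptions for illustration and do not describe how this application actually buffers audio.

```python
from collections import deque
from typing import Iterable, List


class RollingAudioBuffer:
    """Keep only the most recent audio samples in memory (hypothetical helper)."""

    def __init__(self, sample_rate: int = 16000, max_seconds: float = 5.0):
        # A deque with maxlen silently drops the oldest samples when full.
        self._samples = deque(maxlen=int(sample_rate * max_seconds))

    def extend(self, samples: Iterable[float]) -> None:
        """Add newly captured samples to the window."""
        self._samples.extend(samples)

    def snapshot(self) -> List[float]:
        """Return a copy of the current window for transcription."""
        return list(self._samples)

    def clear(self) -> None:
        """Drop all buffered audio, for example when a session ends."""
        self._samples.clear()
```

Nothing in this sketch writes to disk, so the buffered audio disappears when the buffer is cleared or the process exits.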