---
title: Talklas 2
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
app_port: 8000
health_check_path: /health
---

# Talklas API

This FastAPI app is deployed on Hugging Face Spaces and provides audio transcription, translation, and text-to-speech (TTS). It loads all required models at startup, can update each model on demand, and includes a feature for detecting inappropriate language. It exposes the following endpoints:

- `/`: Returns a simple health check response.
- `/health`: Health check endpoint for Hugging Face Spaces.
- `/update-languages`: Updates the source and target languages for STT and TTS models.
- `/translate-text`: Translates text and converts it to speech.
- `/translate-audio`: Transcribes audio, translates the text, and converts the translated text to speech. Includes speech detection to handle silent audio gracefully.
- `/text-to-speech`: Generates speech from the given text and target language.
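
As a rough illustration of calling one of these endpoints, the sketch below builds (but does not send) a POST request for `/translate-text` using only the Python standard library. The base URL, the use of a JSON body, and the field names `text`, `source_lang`, and `target_lang` are all assumptions made for this example; check the app's actual request schema before relying on them.

```python
import json
from urllib.request import Request

# Hypothetical base URL; replace with the URL of the running Space.
BASE_URL = "http://localhost:8000"


def build_translate_text_request(text, source_lang, target_lang, base_url=BASE_URL):
    """Build (but do not send) a POST request for /translate-text.

    The JSON body and its field names are assumptions for illustration,
    not the app's confirmed schema.
    """
    payload = {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
    return Request(
        url=f"{base_url.rstrip('/')}/translate-text",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the real payload format is confirmed.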

## Features

- **Speech Detection**: The `/translate-audio` endpoint detects if the audio is silent (no speech) and returns a user-friendly response.
- **Transcription (STT)**: Uses Whisper when the source language is English or Tagalog, and MMS for the other Philippine languages.
- **Translation (MT)**: Uses the NLLB-200 model to translate text between supported languages.
- **Text-to-Speech (TTS)**: Uses MMS-TTS models to convert translated text to speech.
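
The model routing and silence handling described above could be sketched roughly as follows. The function names and the RMS-energy threshold are hypothetical: this README does not say how the app detects speech, so a simple energy check stands in for it here.

```python
import math

# Languages routed to Whisper, per the feature list; all others go to MMS.
WHISPER_LANGUAGES = {"English", "Tagalog"}


def choose_stt_backend(source_language):
    """Return "whisper" for English/Tagalog and "mms" for other languages."""
    return "whisper" if source_language in WHISPER_LANGUAGES else "mms"


def is_silent(samples, threshold=0.01):
    """Treat audio as silent when its RMS energy falls below `threshold`.

    `samples` is a sequence of floats in [-1.0, 1.0]. The threshold value is
    an assumption for illustration, not the app's actual setting.
    """
    if not samples:
        return True
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms < threshold
```

A request handler could call `is_silent` first and return a friendly message early, skipping transcription and translation when no speech is present.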

## Supported Languages

- English
- Tagalog
- Cebuano
- Ilocano
- Waray
- Pangasinan

## Deployment

This app uses a `Dockerfile` to deploy a FastAPI app with Uvicorn. The health check path is set to `/health` to ensure Hugging Face Spaces can verify the app is running.
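
For local testing, a small polling helper can mimic the kind of check Spaces performs against `/health`. This is a sketch under assumptions: it treats an HTTP 200 response as healthy, and the helper names, retry count, and delay are illustrative rather than part of the app.

```python
import time
from urllib.request import urlopen


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except OSError:
        # Covers URLError, HTTPError, timeouts, and connection errors.
        return None


def wait_until_healthy(base_url, attempts=10, delay=1.0, fetch=fetch_status):
    """Poll `<base_url>/health` until it returns 200 or attempts run out."""
    url = f"{base_url.rstrip('/')}/health"
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8000")` could be run after starting the container, since model loading at startup may delay the first successful response.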