Spaces:
Build error
Build error
title: Tortoise TTS API | |
emoji: 🦀 | |
colorFrom: yellow | |
colorTo: purple | |
sdk: gradio | |
sdk_version: 5.23.1 | |
app_file: app.py | |
pinned: false | |
license: apache-2.0 | |
short_description: Text-to-speech using Gradio, FastAPI, and TorToise TTS | |
tags: | |
- tortoise-tts | |
- text-to-speech | |
- voice-cloning | |
- gradio | |
- fastapi | |
# Voice Chat Assistant | |
A conversational voice assistant powered by AI that responds to your spoken queries with natural-sounding speech. | |
## Features | |
- Speech Recognition: Uses OpenAI's Whisper model to accurately transcribe your voice | |
- Natural Language Understanding: Leverages Cohere's LLM API for intelligent responses | |
- Text-to-Speech: Generates natural speech using Tortoise-TTS | |
- Reply on Pause: Automatically responds when you finish speaking | |
- Conversation History: Maintains context throughout your dialogue | |
## Demo | |
Speak into your microphone and the assistant will respond with voice! | |
## How It Works | |
- Your voice is transcribed to text using Whisper | |
- The text is processed by Cohere's LLM to generate a response | |
- The response is converted to speech using Tortoise-TTS | |
- The conversation continues with full context retention | |
## Technical Details | |
This project utilizes: | |
- Zero-GPU: Efficient GPU memory usage with Hugging Face's Zero-GPU technology | |
- FastRTC: Real-time communication for seamless voice interaction | |
- Gradio: Simple and intuitive user interface | |
## Setup | |
To run this locally, you'll need a Cohere API key and Python 3.8+. | |
## Acknowledgements | |
- OpenAI for the Whisper speech recognition model | |
- Cohere for the language model API | |
- Tortoise-TTS for the text-to-speech capabilities | |
- Hugging Face for the Spaces and Zero-GPU infrastructure |