Command_RTC / README.md
RSHVR's picture
Update README.md
f31bed1 verified

A newer version of the Gradio SDK is available: 5.29.0

Upgrade
metadata
title: Tortoise TTS API
emoji: 🦀
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.23.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Text-to-speech using Gradio, FastAPI, and TorToise TTS
tags:
  - tortoise-tts
  - text-to-speech
  - voice-cloning
  - gradio
  - fastapi

Voice Chat Assistant

A conversational voice assistant powered by AI that responds to your spoken queries with natural-sounding speech.

Features

  • Speech Recognition: Uses OpenAI's Whisper model to accurately transcribe your voice
  • Natural Language Understanding: Leverages Cohere's LLM API for intelligent responses
  • Text-to-Speech: Generates natural speech using Tortoise-TTS
  • Reply on Pause: Automatically responds when you finish speaking
  • Conversation History: Maintains context throughout your dialogue

Demo

Speak into your microphone and the assistant will respond with voice!

How It Works

  • Your voice is transcribed to text using Whisper
  • The text is processed by Cohere's LLM to generate a response
  • The response is converted to speech using Tortoise-TTS
  • The conversation continues with full context retention

Technical Details

This project utilizes:

  • Zero-GPU: Efficient GPU memory usage with Hugging Face's Zero-GPU technology
  • FastRTC: Real-time communication for seamless voice interaction
  • Gradio: Simple and intuitive user interface

Setup

To run this locally, you'll need a Cohere API key and Python 3.8+.

Acknowledgements

  • OpenAI for the Whisper speech recognition model
  • Cohere for the language model API
  • Tortoise-TTS for the text-to-speech capabilities
  • Hugging Face for the Spaces and Zero-GPU infrastructure