import gradio as gr
import openai

from constants import *

openai.api_key = OPENAI_API_KEY

title = "Car Seats Voice Commands"
description = """
This is a demo for controlling car seats with voice commands. The inputs section is on the left
and the outputs are on the right. There are two input choices, **Voice** and **Text**.
Use **Voice** for an experience closer to the final product, or use **Text** if you just want to test the command model.
The outputs are the **transcription** (please check that it is accurate), the **command** (which
command the system detected), and the robot voice (again, use this for a more realistic experience).

**Features**: You can either activate or deactivate the following features:
- Heated Seats
- Cooled Seats
- Massage Seats

Examples:
- **Direct Commands**: Try saying something like "Activate heated seats" or "Turn off massage seats"
- **Indirect Commands**: Try "My back is cold", "No heating is needed anymore" or "I'm stressed today"
"""
article = """
This demo processes commands in two steps: a transcription phase followed by a
command classification phase. Transcription uses the OpenAI Whisper model, and for classification
I fine-tuned the OpenAI **ada** model on car seat commands.
"""


# Earlier version, kept for reference: classification with the fine-tuned model.
# def get_command(command, model, id2label):
#     """
#     This function gets the classification output from the OpenAI API
#     """
#     completion = openai.Completion.create(
#         model=model, prompt=f"{command}->", max_tokens=1, temperature=0
#     )
#     id = int(completion["choices"][0]["text"].strip())
#     result = id2label[id] if id in id2label else "unknown"
#     return result


def get_command(command, model, id2label):
    """
    Classify a seat command through the OpenAI Completion API.

    Returns the label mapped to the predicted code, or "unknown" when the
    reply is not a known code. `model` is accepted but unused here: this
    version always prompts text-davinci-003.
    """
    prompt = f"""
We want to control the seats of a car which has features to cool, heat, or massage a seat. The user said "{command}". Which feature should we use to ensure user comfort? Give just the number of the feature.
Mapping:
1: "massage_seats_on"
2: "massage_seats_off"
3: "heated_seats_on"
4: "heated_seats_off"
5: "cooled_seats_on"
6: "cooled_seats_off"
Command_Code:
"""
    completion = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=2, temperature=0
    )
    reply = completion["choices"][0]["text"].strip()
    try:
        code = int(reply)
    except ValueError:
        # The model replied with something other than a bare number.
        return "unknown"
    result = id2label[code] if code in id2label else "unknown"
    return result
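

# Illustrative sketch, not part of the original app: the completion text is
# converted with int(), which only accepts a bare number, so a reply such as
# "3." or "Code: 3" cannot be converted. The hypothetical helpers below show
# one way to pull the first integer out of a reply and map it to a label.
# EXAMPLE_ID2LABEL, parse_command_code and label_from_completion are assumed
# names; the real id2label mapping lives in constants.py and may differ.
import re

# Assumed mapping, copied from the codes listed in the prompt above.
EXAMPLE_ID2LABEL = {
    1: "massage_seats_on",
    2: "massage_seats_off",
    3: "heated_seats_on",
    4: "heated_seats_off",
    5: "cooled_seats_on",
    6: "cooled_seats_off",
}


def parse_command_code(text):
    """Return the first integer found in `text`, or None if there is none."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


def label_from_completion(text, mapping=EXAMPLE_ID2LABEL):
    """Map a raw completion string to a label, falling back to "unknown"."""
    code = parse_command_code(text)
    return mapping.get(code, "unknown")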


def transcribe(audio, text):
    """
    If text is provided, classify it directly.
    Otherwise transcribe the audio, then classify the transcription.
    """
    # MODEL, id2label, text_respnses and resoponses come from constants.py.
    if text:
        result = get_command(text, MODEL, id2label)
        return "Text provided by the user", text_respnses[result], None
    # Transcribe the recording; the context manager closes the file afterwards.
    with open(audio, "rb") as audio_file:
        transcription = openai.Audio.transcribe("whisper-1", audio_file, language="en")
    transcription = transcription["text"]
    result = get_command(transcription, MODEL, id2label)
    # Skip the audio reply when no response is registered for this result.
    make_audio = resoponses.get(result)
    audio_res = make_audio() if make_audio else None
    return transcription, text_respnses[result], audio_res


if __name__ == "__main__":
    gr.Interface(
        fn=transcribe,
        inputs=[
            gr.Audio(label="", source="microphone", type="filepath"),
            gr.Textbox(label="If you prefer, type your command (more accurate)"),
        ],
        outputs=[
            gr.Textbox(
                label="Input Transcription (Please check that this matches what you've said)"
            ),
            gr.Textbox(label="Machine Response (Text Version)"),
            gr.Audio(label="Machine Response (Audio Version)"),
        ],
        allow_flagging="auto",
        title=title,
        description=description,
        article=article,
    ).launch()