Spaces: GTimothee/hf_speech2text_tool
GTimothee committed on Jan 30

Commit 87411e9 (verified) · 1 Parent(s): f4c6a43

Update README.md

Files changed (1): README.md

@@ -12,4 +12,37 @@ tags:
 short_description: Reads an audio file and returns its transcript.
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Speech-to-text for your agent using Whisper through the Hugging Face API
+A simple tool for prototyping agents that can extract text from audio.
+This tool:
+1. opens and reads an audio file
+2. calls the Hugging Face API with your HF token to get a transcript
+3. returns the transcript as a string
+Useful for implementing voice commands.
+# Usage
+```python
+from smolagents import CodeAgent, HfApiModel, Tool
+
+hf_token = "<yourtokenhere>"  # your Hugging Face access token
+filepath = "path/to/audio.wav"  # placeholder: path to the audio file to transcribe
+
+# Load the tool from the Hub (the repo id matches this Space's name).
+hf_speech2text_tool = Tool.from_hub(
+    "GTimothee/hf_speech2text_tool",
+    token=hf_token,
+    trust_remote_code=True,
+)
+model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct", token=hf_token)
+agent = CodeAgent(tools=[hf_speech2text_tool], model=model)
+
+# additional_args are made available to the agent when it calls the tool.
+output = agent.run(
+    "Use your tools to read the audio file and return the transcription.",
+    additional_args={
+        "audio_filepath": filepath,
+        "hf_token": hf_token,
+        "model_for_transcription": "whisper-small.en",
+    },
+)
+```
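
The three steps the README lists (read the audio file, call the API with a token, return the string) could look roughly like the sketch below. It is not the tool's actual implementation: the endpoint URL, the `openai/` model prefix, the `{"text": ...}` response shape, and the names `build_request`, `send`, `send_fn`, and `transcribe` are all assumptions made for illustration.

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable

# Assumption: base URL of the serverless Inference API; the real tool may use another endpoint.
API_BASE = "https://api-inference.huggingface.co/models"


def build_request(audio_bytes: bytes, hf_token: str, model_id: str) -> urllib.request.Request:
    """Build a POST request carrying the raw audio bytes and a bearer token."""
    return urllib.request.Request(
        url=f"{API_BASE}/{model_id}",
        data=audio_bytes,
        headers={"Authorization": f"Bearer {hf_token}"},
        method="POST",
    )


def send(request: urllib.request.Request) -> bytes:
    """Send the request over the network and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


def transcribe(
    audio_filepath: str,
    hf_token: str,
    model_id: str = "openai/whisper-small.en",  # assumed full repo id for 'whisper-small.en'
    send_fn: Callable[[urllib.request.Request], bytes] = send,
) -> str:
    """Read an audio file, submit it for transcription, and return the text."""
    audio_bytes = Path(audio_filepath).read_bytes()                 # 1. open and read the file
    raw = send_fn(build_request(audio_bytes, hf_token, model_id))   # 2. call the API
    return json.loads(raw)["text"]                                  # 3. return the string
```

Passing `send_fn` as a parameter is only a convenience so the network call can be replaced with a stub when experimenting offline.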