---
title: Hf Speech2text Tool
emoji: 💻
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 5.13.2
app_file: app.py
pinned: false
tags:
- tool
short_description: Reads an audio file and returns its transcript.
---
# Speech2text tool for your agent
Uses the Hugging Face API under the hood.
A simple tool for prototyping agents that need to extract text from audio.
This tool:
1. opens and reads an audio file,
2. calls the Hugging Face API with your HF token to get a transcript,
3. returns the transcript as a string.
Useful for implementing voice commands.
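
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function names `make_hf_backend` and `transcribe_file` are hypothetical, and the use of `huggingface_hub.InferenceClient` is an assumption about how the API call could be made.

```python
from pathlib import Path
from typing import Callable


def make_hf_backend(hf_token: str, model_id: str) -> Callable[[bytes], str]:
    """Build a bytes -> text function backed by the Hugging Face Inference API.

    Assumption: the call goes through huggingface_hub.InferenceClient;
    the tool itself may reach the API differently.
    """
    # Imported lazily so the rest of the sketch runs without huggingface_hub.
    from huggingface_hub import InferenceClient

    client = InferenceClient(token=hf_token)

    def backend(audio_bytes: bytes) -> str:
        result = client.automatic_speech_recognition(audio_bytes, model=model_id)
        return result.text

    return backend


def transcribe_file(audio_filepath: str, backend: Callable[[bytes], str]) -> str:
    """Read the audio file, send its bytes to the backend, return the text."""
    audio_bytes = Path(audio_filepath).read_bytes()  # step 1: open and read
    transcript = backend(audio_bytes)                # step 2: call the API
    return transcript.strip()                        # step 3: return the string
```

With a real token, `transcribe_file(filepath, make_hf_backend(hf_token, "openai/whisper-small.en"))` would return the transcript (the model id here is only an example). Passing the backend as a parameter also lets you swap in a stub function to try the flow offline.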
# Usage
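```python
import os

from smolagents import CodeAgent, HfApiModel, Tool

# Read the token from the environment rather than hard-coding it.
# HF_TOKEN is an assumed variable name; use whichever one holds your token.
hf_token = os.environ["HF_TOKEN"]

# Placeholder path; replace with the audio file you want to transcribe.
filepath = "path/to/audio.wav"

# Load the tool from the Hub (trust_remote_code=True executes its code locally).
hf_speech2text_tool = Tool.from_hub(
    "GTimothee/hf_text2speech_tool",
    token=hf_token,
    trust_remote_code=True,
)

model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct", token=hf_token)
agent = CodeAgent(tools=[hf_speech2text_tool], model=model)

# additional_args are passed to the agent as variables it can use during the run.
output = agent.run(
    "Use your tools to read the audio file and return the transcription.",
    additional_args={
        "audio_filepath": filepath,
        "hf_token": hf_token,
        "model_for_transcription": "whisper-small.en",
    },
)
```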