Commit 87411e9 by GTimothee · verified · 1 Parent(s): f4c6a43

Update README.md

Files changed (1)
  1. README.md +34 -1
README.md CHANGED
@@ -12,4 +12,37 @@ tags:
  short_description: Reads an audio file and returns its transcript.
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Speech-to-text for your agent using Whisper through the Hugging Face API
+
+ A simple tool for prototyping agents that can extract text from audio.
+
+ This tool:
+ 1. opens and reads an audio file
+ 2. calls the Hugging Face API with your HF token to get a transcript
+ 3. returns the transcript as a string
+
+ Useful for implementing voice commands.
+
+ # Usage
+
+ ```python
+ from smolagents import CodeAgent, HfApiModel, Tool
+
+ HF_TOKEN = "<yourtokenhere>"  # placeholder: your Hugging Face access token
+ filepath = "<path/to/audio/file>"  # placeholder: the audio file to transcribe
+
+ # Loading a tool from the Hub executes its code locally,
+ # so trust_remote_code=True is required.
+ hf_text2speech_tool = Tool.from_hub(
+     "GTimothee/hf_text2speech_tool",
+     token=HF_TOKEN,
+     trust_remote_code=True,
+ )
+
+ model = HfApiModel("Qwen/Qwen2.5-Coder-32B-Instruct", token=HF_TOKEN)
+ agent = CodeAgent(tools=[hf_text2speech_tool], model=model)
+
+ # additional_args are made available to the agent as variables
+ # that it can pass on to the tool.
+ output = agent.run(
+     "Use your tools to read the audio file and return the transcription.",
+     additional_args={
+         "audio_filepath": filepath,
+         "hf_token": HF_TOKEN,
+         "model_for_transcription": "whisper-small.en",
+     },
+ )
+ ```
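
For readers of this diff, the three steps the README lists (read the file, call the API, return the string) could be sketched roughly as below. This is an illustration under stated assumptions, not the tool's actual source: `transcribe_file` and `make_hf_recognizer` are hypothetical names, and the sketch assumes a recent `huggingface_hub` release in which `InferenceClient.automatic_speech_recognition` returns an object with a `.text` field.

```python
from pathlib import Path
from typing import Callable


def transcribe_file(audio_filepath: str, recognize: Callable[[bytes], str]) -> str:
    """Read an audio file and return its transcript as a plain string."""
    # Step 1: open and read the audio file as raw bytes.
    audio_bytes = Path(audio_filepath).read_bytes()
    # Step 2: hand the bytes to a speech-recognition backend.
    transcript = recognize(audio_bytes)
    # Step 3: return the transcript, trimmed of surrounding whitespace.
    return transcript.strip()


def make_hf_recognizer(hf_token: str, model_id: str) -> Callable[[bytes], str]:
    """Build a recognizer backed by the Hugging Face Inference API (assumed backend)."""
    # Imported lazily so transcribe_file stays usable without huggingface_hub installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(token=hf_token)

    def recognize(audio_bytes: bytes) -> str:
        # Assumption: the call returns an object exposing the transcript as .text.
        return client.automatic_speech_recognition(audio_bytes, model=model_id).text

    return recognize
```

Passing the backend in as a callable keeps the file handling testable without network access. With a real token, something like `transcribe_file(filepath, make_hf_recognizer(hf_token, "openai/whisper-small.en"))` would be the expected call; the full model ID here is an assumption, since the README only gives `whisper-small.en`.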