Update README.md
README.md
CHANGED
@@ -1,59 +1,45 @@
+---
+title: AGI Telecom POC
+emoji: 📡
+colorFrom: blue
+colorTo: indigo
+sdk: docker
+sdk_version: "latest"
+app_file: app.py
+pinned: false
+---
+
 # AGI Telecom POC
 
 This Hugging Face Space demonstrates an AGI-powered telecom interface that enables voice and text interaction through telecommunication channels (WebRTC/SIP).
 
 ## Overview
 
-This proof-of-concept showcases
-
+This proof-of-concept showcases:
 - Multimodal communication (voice + text)
-- Agentic intelligence (reasoning, memory)
-- Telecom-enabled delivery
-
-## Demo Usage
+- Agentic intelligence (reasoning, memory, response)
+- Telecom-enabled delivery (SIP/WebRTC)
 
-
+The system is powered by:
+- Meta-Llama-3.1-8B-Instruct through Hugging Face Inference Endpoints
+- Whisper for speech-to-text conversion
+- Edge TTS for natural-sounding speech synthesis
 
-
-- Upload audio or use text input
-- Get transcriptions, agent responses, and speech synthesis
-- Manage conversation sessions
-
-2. **API Endpoints**: Direct API access for more advanced integration
-- `/api/transcribe` - Convert audio to text
-- `/api/query` - Process text with agent
-- `/api/speak` - Convert text to speech
-- `/api/session` - Create new conversation sessions
-
-## Architecture
+## Using the Interface
 
-
+This demo provides two ways to interact with the system:
 
-
-
-
+1. **Web Interface**: A user-friendly chat interface with voice capabilities
+- Type messages or use voice input
+- See real-time visualizations of audio
+- Experience AI responses via text and speech
 
-
+2. **API Endpoints**: Direct access for integration
+- `/query` - Process text with agent
+- `/transcribe` - Convert audio to text
+- `/speak` - Convert text to speech
+- `/complete_flow` - End-to-end processing
 
-
-
-1. Clone the repository
-2. Install dependencies: `pip install -r requirements.txt`
-3. Run the app: `python app.py`
-4. Open http://localhost:8000 in your browser
-
-## Notes
-
-- This demo uses simplified mock implementations
-- For production use, you would replace the mock functions with:
-- Whisper for speech-to-text
-- A proper LLM (like LLAMA, Mistral) for reasoning
-- A high-quality TTS engine
-- Full WebRTC/SIP implementation
-
-## Future Extensions
+## Architecture
 
-
-- Mesh networking with fallback intelligence
-- Enhanced multi-agent collaboration
-- Advanced contextual reasoning
+The system follows this processing flow:
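
The updated README lists `/query`, `/transcribe`, `/speak`, and `/complete_flow` but does not document request or response formats. The sketch below shows one way a client might prepare JSON requests for the two text-based endpoints using only the Python standard library. The base URL (taken from the `localhost:8000` address in the removed local-development steps), the `text` payload field, and the helper names `build_json_request` and `send` are all assumptions for illustration, not the Space's documented API.

```python
import json
import urllib.request

# Assumption: local development address from the earlier README revision.
# Replace with the address the app actually listens on.
BASE_URL = "http://localhost:8000"


def build_json_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one endpoint path."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hypothetical payload shape: the README does not name the request fields.
query_request = build_json_request("/query", {"text": "What is my data balance?"})
speak_request = build_json_request("/speak", {"text": "Your balance is 2 GB."})
```

Building requests separately from sending them keeps the sketch checkable without a running server. Calling `send(query_request)` would perform the actual HTTP call once the payload shape has been confirmed against `app.py`.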