Update README.md
README.md
CHANGED
@@ -1,3 +1,18 @@
+---
+title: Bpe Tokenizer
+emoji: 🔥
+colorFrom: blue
+colorTo: yellow
+sdk: gradio
+sdk_version: 5.12.0
+app_file: app.py
+pinned: false
+license: apache-2.0
+short_description: Telugu BPE tokenizer with vocabulary of 4800 words.
+---
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
 # Telugu Text Tokenizer
 
 A Gradio web interface for encoding and decoding Telugu text using a trained BPE tokenizer.
@@ -34,17 +49,3 @@ The tokenizer is trained on a diverse corpus of Telugu text with:
 - Target compression ratio: ≥ 3.2x
 - Perfect reconstruction guarantee
 
----
-- title: Bpe Tokenizer
-- emoji: 🔥
-- colorFrom: blue
-- colorTo: yellow
-- sdk: gradio
-- sdk_version: 5.12.0
-- app_file: app.py
-- pinned: false
-- license: apache-2.0
-- short_description: Telugu BPE tokenizer with vocabulary of 4800 words.
----
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
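
Note on the change: Hugging Face Spaces reads its configuration from a YAML block at the very top of README.md, which is presumably why this commit moves the block from the end of the file to the start and drops the leading `- ` from each key. The following is a minimal sketch of that "top of file only" rule, not the Hub's actual parser. The function name `read_front_matter` and the two sample strings are hypothetical and abbreviated for illustration.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Return `key: value` pairs from a leading `---` block, or {} if there is none.

    Hypothetical helper: a simplified stand-in for a real YAML front-matter parser.
    """
    lines = text.splitlines()
    # The block only counts if the very first line is the opening delimiter.
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter found
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # opening delimiter was never closed


# Abbreviated versions of the README before and after this commit (assumed samples).
old_layout = "# Telugu Text Tokenizer\n\n---\n- sdk: gradio\n- app_file: app.py\n---\n"
new_layout = "---\nsdk: gradio\napp_file: app.py\n---\n\n# Telugu Text Tokenizer\n"

print(read_front_matter(old_layout))
print(read_front_matter(new_layout))
```

Under this sketch the old layout yields no configuration at all, because the file opens with a heading rather than `---`, while the new layout yields the `sdk` and `app_file` keys.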