kishkath committed on
Commit 45095f6 · verified · 1 Parent(s): 0d5eb64

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -32,4 +32,19 @@ A Gradio web interface for encoding and decoding Telugu text using a trained BPE
  The tokenizer is trained on a diverse corpus of Telugu text with:
  - Maximum vocabulary size: 5000 tokens
  - Target compression ratio: ≥ 3.2x
- - Perfect reconstruction guarantee
+ - Perfect reconstruction guarantee
+
+ ---
+ title: Bpe Tokenizer
+ emoji: 🔥
+ colorFrom: blue
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: 5.12.0
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ short_description: Telugu BPE tokenizer with vocabulary of 4800 words.
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
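
Note: the README lines in this hunk state two properties of the tokenizer, a target compression ratio of at least 3.2x and perfect reconstruction. Below is a minimal sketch of how such claims could be checked. It assumes that "compression ratio" means UTF-8 bytes per token (the README does not define it) and uses a hypothetical `encode`/`decode` pair. The byte-level stand-in tokenizer is only there to make the example runnable; it is not the Space's trained BPE model.

```python
from typing import Callable, List


def compression_ratio(text: str, encode: Callable[[str], List[int]]) -> float:
    """UTF-8 bytes per token. This definition is an assumption, not from the README."""
    ids = encode(text)
    if not ids:
        raise ValueError("encode() returned no tokens")
    return len(text.encode("utf-8")) / len(ids)


def round_trips(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
) -> bool:
    """True if decode(encode(text)) reproduces the input exactly."""
    return decode(encode(text)) == text


# Hypothetical stand-in tokenizer: one token per UTF-8 byte.
# A trained BPE tokenizer would replace these two functions.
def byte_encode(text: str) -> List[int]:
    return list(text.encode("utf-8"))


def byte_decode(ids: List[int]) -> str:
    return bytes(ids).decode("utf-8")


sample = "తెలుగు భాష"
ratio = compression_ratio(sample, byte_encode)
lossless = round_trips(sample, byte_encode, byte_decode)
print(f"ratio={ratio:.2f} lossless={lossless}")
```

With the byte-level stand-in the ratio is 1.0 by construction, so it serves as a baseline; swapping in the real tokenizer's encode and decode functions would show whether the stated 3.2x target holds on a given sample.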