---
title: README
emoji: 🚀
colorFrom: purple
colorTo: gray
sdk: static
pinned: false
---

Multilingual language models have many deployment challenges.
![Deployment Challenges](DeploymentChallenges.png)

Can we engineer multilingual language models that match the capabilities of their larger counterparts while being more compact, faster, and able to serve large batches in real-time production environments?
![Memory variations over time](MemoryVariations(Latency).png)

# Techniques:
- Pruning
  - Unstructured Pruning
  - Structured Pruning
  - Semi-Structured Pruning

  - Methods Used
    - SparseGPT | [GitHub](https://github.com/VishnuVardhanSaiLanka/sparsegpt/tree/aya)
    - ShortGPT | [KLDBasedPruning & Perplexity Sensitivities](https://github.com/rsk2327/DistAya/tree/main)
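
As a rough, self-contained sketch of the unstructured pruning direction above (not the SparseGPT or ShortGPT procedures linked), the snippet below zeroes out the lowest-magnitude weights in every linear layer. The 30% sparsity level and the toy model are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils import prune

def magnitude_prune(model: nn.Module, amount: float = 0.3) -> nn.Module:
    """Zero out the `amount` fraction of lowest-magnitude weights
    in every Linear layer (unstructured magnitude pruning)."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # make the zeros permanent
    return model

# Toy usage on a small stand-in model (not an actual multilingual LLM).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
magnitude_prune(model, amount=0.3)
sparsity = (model[0].weight == 0).float().mean().item()
print(f"Layer 0 sparsity: {sparsity:.2%}")
```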

- Knowledge Distillation 
  - Hidden State-Based Distillation ~ [DistillKit](https://arcee-ai-distillkit.my.canva.site/) | [GitHub](https://github.com/ShayekhBinIslam/DistillKit)
  - Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  - On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
  - Minitron: Compact Language Models via Pruning & Knowledge Distillation
  - DistiLLM: Towards Streamlined Distillation for Large Language Models
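
A minimal sketch of a logit-level distillation loss, assuming the common blend of softened KL divergence (student mimics teacher) and hard-label cross-entropy; the temperature and weighting are illustrative assumptions, and this is not DistillKit's hidden-state-based objective.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend a soft KL term (student follows teacher) with hard-label cross-entropy."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kl + (1.0 - alpha) * ce

# Toy usage: a batch of 8 examples over a 100-token vocabulary.
student = torch.randn(8, 100, requires_grad=True)
teacher = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student, teacher, labels)
loss.backward()
```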

- Quantization
  - Quantization Aware Training (QAT)
  - Post Training Quantization (PTQ)
    - KV Cache Quantization
    - Weight & Activation Quantization
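
A tiny post-training quantization sketch, assuming symmetric per-output-channel int8 quantization of a single weight matrix with activations left in float. It illustrates the weight-quantization bullet above rather than QAT or KV-cache quantization.

```python
import torch

def quantize_int8_per_channel(weight: torch.Tensor):
    """Symmetric per-output-channel int8 quantization (no calibration data)."""
    max_abs = weight.abs().amax(dim=1, keepdim=True)   # one scale per output row
    scale = max_abs.clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

# Toy usage: quantize one weight matrix and check the reconstruction error.
w = torch.randn(32, 16)
q, scale = quantize_int8_per_channel(w)
err = (dequantize(q, scale) - w).abs().max().item()
print(f"int8 storage, max reconstruction error: {err:.4f}")
```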

- Low-Rank Factorization
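
A minimal sketch of low-rank factorization via truncated SVD of a single linear layer; the layer size and the rank are illustrative assumptions, not a specific method from the repositories above.

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace a Linear layer with two smaller ones via truncated SVD:
    W (out x in)  ->  B (out x rank) @ A (rank x in)."""
    W = layer.weight.data
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = nn.Linear(layer.in_features, rank, bias=False)
    B = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    A.weight.data = (torch.diag(S[:rank]) @ Vh[:rank]).contiguous()
    B.weight.data = U[:, :rank].contiguous()
    if layer.bias is not None:
        B.bias.data = layer.bias.data.clone()
    return nn.Sequential(A, B)

# Toy usage: a 256x256 layer (~65k weights) shrinks to ~16k weights at rank 32.
layer = nn.Linear(256, 256)
low_rank = factorize_linear(layer, rank=32)
x = torch.randn(4, 256)
print((layer(x) - low_rank(x)).abs().max().item())  # approximation error
```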

- Fine-Tuning | [GitHub](https://github.com/rsk2327/DistAya/tree/track/fine-tuning)

![Techniques](Techniques.png)


# Datasets:
An initial set of 7 datasets has been unified into 6.62M rows, comprising the following:
- Bangla_Alpaca_Orca: Bangla
- Urdu_Instruct_News_Article_Generation: Urdu
- Urdu_Instruct_News_Headline_Generation: Urdu
- Urdu_Instruct_News_Category_Classification: Urdu
- cidar: Arabic
- Six_Millions_Instruction_Dataset_For_Arabic_Llm_Ft: Arabic
- instructv3: English
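
A sketch of how such datasets could be unified with the Hugging Face `datasets` library. The Hub IDs and column names below are placeholders for illustration, not the project's actual dataset paths or schemas.

```python
from datasets import load_dataset, concatenate_datasets

# Placeholder Hub IDs; substitute the project's real dataset paths.
DATASET_IDS = [
    "org/Bangla_Alpaca_Orca",
    "org/Urdu_Instruct_News_Article_Generation",
    "org/cidar",
]

parts = []
for dataset_id in DATASET_IDS:
    ds = load_dataset(dataset_id, split="train")
    # Keep only shared instruction/response columns so the schemas line up
    # (assumed column names for illustration).
    ds = ds.select_columns(["instruction", "output"])
    parts.append(ds)

unified = concatenate_datasets(parts)
print(unified)  # one combined instruction-tuning corpus
```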

## Get in touch with the team:
- Mayank Bhaskar -> [email protected]
- Ahmad Anis -> [email protected]
- Drishti Sharma -> [email protected]
- Vishnu Vardhan -> [email protected]
- Yaya -> [email protected]
- Shayekh Bin Islam -> [email protected]