
# **English-to-Japanese Translation Project**

## **Overview**
This project builds a system for English-to-Japanese translation using state-of-the-art multilingual models. Two models were used: **mT5** as the primary model and **mBART** as the secondary model. Together, they balance translation quality with versatility across multilingual tasks.

---

## **Models Used**

### **1. mT5 (Primary Model)**
- **Reason for Selection**:  
  - mT5 is highly versatile and trained on a broad multilingual corpus, making it suitable for translation as well as tasks such as summarization and question answering.  
  - It performs well without extensive fine-tuning, saving computational resources.  

- **Strengths**:  
  - Handles translation naturally with minimal training.  
  - Can perform additional tasks beyond translation.  

- **Limitations**:  
  - Sometimes lacks precision in detailed translations.  

---

### **2. mBART (Secondary Model)**
- **Reason for Selection**:  
  - mBART specializes in multilingual translation tasks and provides highly accurate translations when fine-tuned.  

- **Strengths**:  
  - Optimized for translation accuracy, especially for long sentences and contextual consistency.  
  - Produces fewer grammatical and contextual errors in its output.  

- **Limitations**:  
  - Less flexible than mT5 for tasks such as summarization or question answering.  

---

## **Evaluation Strategy**

To evaluate model performance, the following metrics were used:  

1. **BLEU Score**:  
   - Measures n-gram overlap between the model's output and reference translations.  
   - Chosen because it is the standard automatic metric for machine translation quality.  

2. **Training Loss**:  
   - Tracks how well the model fits the training data as training progresses.  
   - A steadily decreasing loss indicates effective learning.  

3. **Perplexity**:  
   - Reflects how uncertain the model is when predicting each token; it is the exponential of the mean cross-entropy loss.  
   - Lower perplexity generally corresponds to more confident, fluent output.  
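
In practice a library such as sacreBLEU computes the full BLEU score, but the two core quantities above can be sketched in a few lines of plain Python. This is a simplified illustration only: real BLEU combines clipped 1- to 4-gram precisions geometrically and applies a brevity penalty.

```python
import math
from collections import Counter


def ngram_precision(candidate: list[str], reference: list[str], n: int) -> float:
    """Clipped n-gram precision: the core quantity inside BLEU.

    Each candidate n-gram is counted at most as often as it appears
    in the reference ("clipping"), so repeating a word cannot inflate
    the score.
    """
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    if not cand:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / sum(cand.values())


def perplexity(mean_cross_entropy_loss: float) -> float:
    """Perplexity is simply the exponential of the mean cross-entropy loss."""
    return math.exp(mean_cross_entropy_loss)
```

For example, the candidate `["the", "the", "cat"]` against the reference `["the", "cat"]` has unigram precision 2/3: the second "the" is clipped. Likewise, a loss of 0 corresponds to a perplexity of 1 (a perfectly confident model).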

---

## **Steps Taken**
1. Fine-tuned both models using a dataset of English-Japanese text pairs to improve translation accuracy.  
2. Tested the models on unseen data to measure their real-world performance.  
3. Applied optimizations such as **4-bit quantization** to reduce the models' memory footprint during evaluation.  
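
As a sketch of step 3, the helper below loads mBART in 4-bit precision via `bitsandbytes`. The specific quantization settings (NF4, fp16 compute) are common defaults assumed here, not settings recorded for this project, and a CUDA GPU with the `transformers`, `bitsandbytes`, and `accelerate` packages is required.

```python
def load_mbart_4bit():
    """Load mBART-50 with its weights quantized to 4 bits.

    Imports are kept local so this module can be imported even on
    machines without the heavy dependencies installed.
    """
    import torch
    from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights in 4 bits
        bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
    )
    return AutoModelForSeq2SeqLM.from_pretrained(
        "facebook/mbart-large-50",
        quantization_config=bnb_config,
        device_map="auto",                     # place layers on available GPUs
    )
```

Quantizing to 4 bits shrinks the weight memory roughly 4x versus fp16, which is what makes evaluating a large model feasible on a single consumer GPU.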

---

## **Results**
- **mT5**:  
  - Performed well on translation and on additional tasks such as summarization and question answering.  
  - Showed versatility but sometimes lacked detailed accuracy for translations.  

- **mBART**:  
  - Delivered precise and contextually accurate translations, especially for longer sentences.  
  - Required fine-tuning but outperformed mT5 in translation-focused tasks.  

- **Overall Conclusion**:  
  mT5 is a flexible model for multilingual tasks, while mBART delivers higher translation quality. Together they balance versatility and accuracy for English-to-Japanese translation.  

---

## **How to Use**
1. Load the models from Hugging Face:  
   - [mT5 Model on Hugging Face](https://huggingface.co/google/mt5-small)  
   - [mBART Model on Hugging Face](https://huggingface.co/facebook/mbart-large-50)  

2. Fine-tune the models for your dataset using English-Japanese text pairs.  
3. Evaluate performance using the BLEU score, training loss, and perplexity.  
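
A minimal sketch of step 1 with the `transformers` library is shown below. Two assumptions to note: the mT5 task prefix is whatever prefix was used during fine-tuning (the one shown is a placeholder), and mBART-50 needs explicit source and target language codes (`en_XX`, `ja_XX`), with generation forced to start from the Japanese language token.

```python
def mt5_input(text: str) -> str:
    # mT5 has no built-in translation prompt; prepend whatever task
    # prefix was used when fine-tuning (this one is an assumption).
    return f"translate English to Japanese: {text}"


def translate_mbart(text: str) -> str:
    """Translate English text to Japanese with mBART-50.

    Imports are local so the pure helper above stays importable
    without the heavy dependencies.
    """
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(
        "facebook/mbart-large-50", src_lang="en_XX")
    model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.lang_code_to_id["ja_XX"],  # target: Japanese
        max_new_tokens=64,
    )
    return tokenizer.decode(generated[0], skip_special_tokens=True)


# Example: translate_mbart("The weather is nice today.")
```

Without fine-tuning on English-Japanese pairs (step 2), the base checkpoints will produce weak translations; the snippet only illustrates the loading and generation plumbing.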

---

## **Future Work**
- Expand the dataset for better fine-tuning.  
- Explore task-specific fine-tuning for mT5 to improve its translation accuracy.  
- Optimize the models further for deployment in resource-constrained environments.  

---

## **References**
- [mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer](https://arxiv.org/abs/2010.11934)  
- [mBART: Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210)  

---