# **English-to-Japanese Translation Project**
## **Overview**
This project builds a system for English-to-Japanese translation using state-of-the-art multilingual models. Two models were used: **mT5** as the primary model and **mBART** as the secondary model. Together they balance multilingual versatility with translation-focused accuracy.
---
## **Models Used**
### **1. mT5 (Primary Model)**
- **Reason for Selection**:
- mT5 is highly versatile: it is pretrained on the multilingual mC4 corpus covering 101 languages, making it suitable for translation as well as other tasks such as summarization and question answering.
- It performs well without extensive fine-tuning, saving computational resources.
- **Strengths**:
- Handles translation naturally with minimal training.
- Can perform additional tasks beyond translation.
- **Limitations**:
- Sometimes lacks precision in detailed translations.
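A minimal inference sketch for mT5, assuming the `transformers` library and a fine-tuned checkpoint; the `google/mt5-small` name and the `translate English to Japanese:` prefix are illustrative, since the base checkpoint is not trained on translation prompts out of the box:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/mt5-small"  # placeholder: replace with your fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Task prefix assumed to match the prompts used during fine-tuning.
text = "translate English to Japanese: The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```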
---
### **2. mBART (Secondary Model)**
- **Reason for Selection**:
- mBART specializes in multilingual translation tasks and provides highly accurate translations when fine-tuned.
- **Strengths**:
- Optimized for translation accuracy, especially for long sentences and contextual consistency.
- Produces grammatically correct, contextually coherent output.
- **Limitations**:
- Less flexible for tasks like summarization or question answering compared to mT5.
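A comparable sketch for mBART; the checkpoint name is a placeholder for your fine-tuned model, and the `en_XX`/`ja_XX` language codes follow the mBART-50 convention of setting the source language on the tokenizer and forcing the target language token at decoding time:

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50"  # placeholder: replace with your fine-tuned checkpoint
tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained(model_name)

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("ja_XX"),  # force Japanese output
    max_new_tokens=64,
    num_beams=4,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```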
---
## **Evaluation Strategy**
To evaluate model performance, the following metrics were used (a computation sketch follows the list):
1. **BLEU Score**:
- Measures n-gram overlap between the model's output and reference translations.
- Chosen because it is the standard automatic metric for translation quality.
2. **Training Loss**:
- Tracks how well the model fits the training data during fine-tuning.
- A steadily decreasing loss indicates the model is learning the task.
3. **Perplexity**:
- The exponential of the cross-entropy loss; it reflects how confidently the model predicts the reference text.
- Lower perplexity generally corresponds to more fluent, accurate translations.
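A rough sketch of how BLEU and perplexity can be computed with the `evaluate` and `torch` packages (assumed installed); the example sentences and the `google/mt5-small` checkpoint are placeholders, not the project's actual evaluation data:

```python
import math
import torch
import evaluate
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# BLEU: compare generated translations against one or more references per prediction.
bleu = evaluate.load("sacrebleu")
predictions = ["今日は天気がいいです。"]    # model outputs (placeholder)
references = [["今日は天気が良いです。"]]   # reference translations (placeholder)
print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])

# Perplexity: exponential of the cross-entropy loss on a held-out pair.
model_name = "google/mt5-small"  # placeholder: replace with your fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
inputs = tokenizer("The weather is nice today.", return_tensors="pt")
labels = tokenizer(text_target="今日は天気がいいです。", return_tensors="pt").input_ids
with torch.no_grad():
    loss = model(**inputs, labels=labels).loss
print("Perplexity:", math.exp(loss.item()))
```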
---
## **Steps Taken**
1. Fine-tuned both models using a dataset of English-Japanese text pairs to improve translation accuracy.
2. Tested the models on unseen data to measure their real-world performance.
3. Applied optimizations such as **4-bit quantization** to reduce memory usage and speed up evaluation (see the loading sketch below).
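One way to realize 4-bit quantization is `bitsandbytes` via `BitsAndBytesConfig`. This is a sketch, assuming a CUDA GPU and the `bitsandbytes` and `accelerate` packages; the exact settings are illustrative rather than the project's final configuration:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig

# Load weights in 4-bit NF4 precision to cut memory use during evaluation.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/mbart-large-50",   # or your fine-tuned checkpoint
    quantization_config=bnb_config,
    device_map="auto",           # requires `accelerate`
)
```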
---
## **Results**
- **mT5**:
- Handled translation well and generalized to additional tasks such as summarization and question answering.
- Showed strong versatility but sometimes lacked fine-grained accuracy in translations.
- **mBART**:
- Delivered precise and contextually accurate translations, especially for longer sentences.
- Required fine-tuning but outperformed mT5 in translation-focused tasks.
- **Overall Conclusion**:
mT5 is a flexible model for multilingual tasks, while mBART delivers higher-quality translations. Together they balance versatility and accuracy, making the pair well suited to English-to-Japanese translation.
---
## **How to Use**
1. Load the models from Hugging Face:
- [mT5 Model on Hugging Face](https://huggingface.co/google/mt5-small)
- [mBART Model on Hugging Face](https://huggingface.co/facebook/mbart-large-50)
2. Fine-tune the models on your own English-Japanese text pairs (a condensed training sketch follows this list).
3. Evaluate performance using BLEU score, training loss, and perplexity.
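A condensed fine-tuning sketch using the `Seq2SeqTrainer` API; the dataset name and the `en`/`ja` column names are placeholders for your own parallel corpus, and the hyperparameters are illustrative:

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/mt5-small"  # or "facebook/mbart-large-50"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

dataset = load_dataset("your-username/en-ja-pairs")  # placeholder dataset name

def preprocess(batch):
    # "en" and "ja" are placeholder column names for source/target text.
    model_inputs = tokenizer(batch["en"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["ja"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)

args = Seq2SeqTrainingArguments(
    output_dir="en-ja-translation",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```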
---
## **Future Work**
- Expand the dataset for better fine-tuning.
- Explore task-specific fine-tuning for mT5 to improve its translation accuracy.
- Optimize the models further for deployment in resource-constrained environments.
---
## **References**
- [mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer](https://arxiv.org/abs/2010.11934)
- [mBART: Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210)
---