KMUTT-CPE35-thai-mt5base-summarizer
This repository contains a fine-tuned version of google/mt5-base for the task of Thai text summarization. The model was trained on 20,000 samples from the ThaiSum dataset and is part of a senior project in the Computer Engineering Department at King Mongkut’s University of Technology Thonburi (KMUTT).
Model Description
- Base model: google/mt5-base
- Task: Text Summarization (Thai)
- Fine-tuning dataset: ThaiSum (20k samples)
- Quantization: 8-bit
- Max sequence length: 512 tokens
Evaluation
Model performance was evaluated with ROUGE, the standard metric for summarization quality. Results on the test set are as follows:
- ROUGE-1: 0.4498
- ROUGE-2: 0.2551
- ROUGE-L: 0.4481
- ROUGE-Lsum: 0.4501
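The card does not state which library produced these scores, so as an illustration only, here is the core of ROUGE-1: unigram-overlap F1 between a reference and a candidate summary (real ROUGE tooling adds bookkeeping such as multi-reference support):

```python
from collections import Counter

def rouge1_f1(reference_tokens, candidate_tokens):
    """Unigram-overlap F1, the core of ROUGE-1 (no stemming applied)."""
    ref, cand = Counter(reference_tokens), Counter(candidate_tokens)
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Thai has no whitespace word boundaries, so tokens would come from a
# word segmenter (e.g. PyThaiNLP's word_tokenize) before scoring.
print(rouge1_f1(["แมว", "นั่ง", "บน", "เสื่อ"], ["แมว", "นอน", "บน", "เสื่อ"]))  # → 0.75
```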
Training procedure
The following bitsandbytes quantization config was used during training:
- quant_method: QuantizationMethod.BITS_AND_BYTES
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
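The same configuration can be expressed as a `transformers.BitsAndBytesConfig` (a sketch for reproduction; the 4-bit fields listed above are library defaults that are inactive when `load_in_8bit=True`):

```python
from transformers import BitsAndBytesConfig

# 8-bit quantization config matching the values listed above.
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
)
```

This object would be passed as `quantization_config=bnb_config` when loading the base model with `from_pretrained`.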
Framework versions
- PEFT 0.6.2
Model tree for EXt1/KMUTT-CPE35-thai-mt5base-summarizer
- Base model: google/mt5-base