KMUTT-CPE35-thai-mt5base-summarizer

This repository contains a fine-tuned version of google/mt5-base for Thai text summarization. The model was fine-tuned on 20,000 samples from the ThaiSum dataset as part of a senior project in the Computer Engineering Department at King Mongkut’s University of Technology Thonburi (KMUTT).

Model Description

  • Base model: google/mt5-base
  • Task: Text Summarization (Thai)
  • Fine-tuning dataset: ThaiSum (20k samples)
  • Quantization: 8-bit
  • Max sequence length: 512 tokens
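A minimal inference sketch based on the details above. The repo id EXt1/KMUTT-CPE35-thai-mt5base-summarizer is taken from this repository; the generation settings (beam count, summary length) are illustrative assumptions, not the project's exact decoding setup:

```python
# Load the 8-bit quantized mT5 base model, attach the PEFT adapter,
# and summarize a Thai article. Requires transformers, peft, bitsandbytes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base", load_in_8bit=True)
model = PeftModel.from_pretrained(base, "EXt1/KMUTT-CPE35-thai-mt5base-summarizer")

article = "..."  # a Thai news article goes here
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=140, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Input is truncated to the 512-token maximum sequence length used during fine-tuning.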

Evaluation

Model performance was evaluated with ROUGE, the standard metric for assessing summarization quality. Results on the test set are as follows:

  • ROUGE-1: 0.4498
  • ROUGE-2: 0.2551
  • ROUGE-L: 0.4481
  • ROUGE-Lsum: 0.4501
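ROUGE-N measures n-gram overlap between a generated summary and a reference. The F-measure variant reported above can be sketched in plain Python (a simplified, whitespace-tokenized illustration, not the exact scorer that produced these numbers):

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """F-measure of n-gram overlap between candidate and reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For Thai text, a proper word tokenizer would replace the whitespace split, since Thai is written without spaces between words.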

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: QuantizationMethod.BITS_AND_BYTES
  • load_in_8bit: True
  • load_in_4bit: False
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32

Framework versions

  • PEFT 0.6.2
Model details

  • Format: Safetensors
  • Model size: 586M params
  • Tensor types: F32, I8