KMUTT-CPE35-thai-mt5base-summarizer

This repository contains a fine-tuned version of google/mt5-base for Thai text summarization. The model was fine-tuned on 20,000 samples from the ThaiSum dataset as part of a senior project in the Computer Engineering Department at King Mongkut’s University of Technology Thonburi (KMUTT).

Model Description

  • Base model: google/mt5-base
  • Task: Text Summarization (Thai)
  • Fine-tuning dataset: ThaiSum (20k samples)
  • Quantization: 8-bit
  • Max sequence length: 512 tokens
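A minimal inference sketch based on the details above. The repo id EXt1/KMUTT-CPE35-thai-mt5base-summarizer is taken from this repository; the generation settings (beam count, summary length) are illustrative assumptions, not the project's exact decoding setup:

```python
# Load the 8-bit quantized mT5 base model, attach the PEFT adapter,
# and summarize a Thai article. Requires transformers, peft, bitsandbytes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
base = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base", load_in_8bit=True)
model = PeftModel.from_pretrained(base, "EXt1/KMUTT-CPE35-thai-mt5base-summarizer")

article = "..."  # a Thai news article goes here
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_new_tokens=140, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Input is truncated to the 512-token maximum sequence length used during fine-tuning.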

Evaluation

Model performance was evaluated with ROUGE, the standard metric for assessing summarization quality. Results on the test set are as follows:

  • ROUGE-1: 0.4498
  • ROUGE-2: 0.2551
  • ROUGE-L: 0.4481
  • ROUGE-Lsum: 0.4501
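ROUGE-N measures n-gram overlap between a generated summary and a reference. The F-measure variant reported above can be sketched in plain Python (a simplified, whitespace-tokenized illustration, not the exact scorer that produced these numbers):

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """F-measure of n-gram overlap between candidate and reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For Thai text, a proper word tokenizer would replace the whitespace split, since Thai is written without spaces between words.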

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: QuantizationMethod.BITS_AND_BYTES
  • load_in_8bit: True
  • load_in_4bit: False
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32

Framework versions

  • PEFT 0.6.2
Model details

  • Format: Safetensors
  • Model size: 586M params
  • Tensor types: F32, I8