Model Description

This is a news article summarization model created by fine-tuning lcw99/t5-base-korean-text-summary, a model built by lcw99.

The training data consists of the newspaper articles from the 'Document Summarization Text' dataset provided by AIHub (https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=97).

We will continue to improve the model's performance and refine it over time.
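
A minimal inference sketch, assuming the standard transformers seq2seq API; the generation parameters and the placeholder article are illustrative and are not taken from this card:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "onebeans/keT5-news-summarizer"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # Korean news article text to summarize

# Tokenize the article; a T5-style task prefix such as "summarize: " may be
# needed depending on how the base model was trained.
inputs = tokenizer(article, max_length=512, truncation=True, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    max_length=128,          # illustrative generation settings
    num_beams=4,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))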

Training Arguments

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./results",              # placeholder; the original path is not given in the card
    evaluation_strategy="epoch",         # evaluate at the end of every epoch
    save_strategy="epoch",               # save a checkpoint every epoch
    save_total_limit=2,                  # keep at most two checkpoints
    warmup_steps=1000,
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    predict_with_generate=True,          # run generate() during evaluation
    fp16=True,                           # mixed precision on the RTX 3070
)
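
For reference, a sketch of how these arguments plug into a Seq2SeqTrainer; the tokenized dataset variables and the data collator setup are assumptions and are not part of the original training script:

from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

# The base checkpoint is stated in the card; tokenized_train and tokenized_eval
# are hypothetical stand-ins for the preprocessed AIHub newspaper-article splits
# (the preprocessing code is not included in the original card).
base_model = "lcw99/t5-base-korean-text-summary"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,                  # the Seq2SeqTrainingArguments defined above
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()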

Training Progress

Epoch  Training Loss  Validation Loss
1      0.604000       0.566043
2      0.577400       0.559071
3      0.553500       0.555571

Environment

Windows 10

NVIDIA GeForce RTX 3070, 8192 MiB

Framework Versions

Python: 3.10.14

PyTorch: 1.12.1

Transformers: 4.46.2

Datasets: 3.2.0

Tokenizers: 0.20.3
