HeR-T: Herbarium specimen label Recognition Transformer
Paper
Application of computer vision to the automated extraction of metadata from natural history specimen labels: A case study on herbarium specimens (Under Review)
Authors
Zacchigna, Jacopo; Liu, Weiwei; Pellegrino, Felice Andrea; Peron, Adriano; Roma-Marzio, Francesco; Peruzzi, Lorenzo; Martellos, Stefano
Overview
HeR-T (Herbarium specimen label Recognition Transformer) is a fine-tuned vision-language model for automated metadata extraction from natural history specimen labels, particularly herbarium specimen labels. It builds on Donut-base and was fine-tuned on 55,089 herbarium specimen images from the Herbarium of the University of Pisa (international acronym PI).
Features
- Fine-tuned on specimen images from the Herbarium of the University of Pisa for automated metadata extraction from natural history specimen labels
- Supports image inputs whose labels contain printed, handwritten, or mixed-format text
- Evaluation: Tree Edit Distance (TED) accuracy score, computed as max(0, 1 − TED(pr, gt) / TED(φ, gt)), where gt, pr, and φ stand for the ground truth, prediction, and empty trees respectively
- Pre-trained weights are loaded from Donut-base (naver-clova-ix/donut-base)
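
The TED accuracy score listed under Features can be illustrated with a small, self-contained sketch. It is not the evaluation code used for HeR-T: it assumes unit costs for inserting, deleting, and relabelling a node, sorts dictionary keys so that field order does not matter, and uses hypothetical field names (`genus`, `species`) for illustration only. The helper names `to_tree`, `forest_distance`, and `ted_accuracy` are likewise introduced here.

```python
from functools import lru_cache

# A tree is a pair (label, children), where children is a tuple of trees.

def to_tree(obj, label="<root>"):
    """Convert nested dicts, lists, and scalars into a hashable labelled tree."""
    if isinstance(obj, dict):
        # Sorting keys is a choice made for this sketch: field order is ignored.
        return (label, tuple(to_tree(v, k) for k, v in sorted(obj.items())))
    if isinstance(obj, list):
        return (label, tuple(to_tree(v, "<item>") for v in obj))
    # A scalar becomes a leaf node hanging under its field name.
    return (label, ((str(obj), ()),))

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1:
        return sum(size(t) for t in f2)  # insert everything in f2
    if not f2:
        return sum(size(t) for t in f1)  # delete everything in f1
    (l1, c1), (l2, c2) = f1[-1], f2[-1]
    delete = forest_distance(f1[:-1] + c1, f2) + 1
    insert = forest_distance(f1, f2[:-1] + c2) + 1
    relabel = (forest_distance(c1, c2)
               + forest_distance(f1[:-1], f2[:-1])
               + (l1 != l2))
    return min(delete, insert, relabel)

def ted(t1, t2):
    """Tree edit distance between two single trees."""
    return forest_distance((t1,), (t2,))

def ted_accuracy(pred, gt):
    """max(0, 1 - TED(pr, gt) / TED(empty, gt)) for JSON-like inputs."""
    pr_tree, gt_tree, empty = to_tree(pred), to_tree(gt), to_tree({})
    baseline = ted(empty, gt_tree)
    if baseline == 0:
        # Ground truth is itself empty: score 1 only for an empty prediction.
        return 1.0 if ted(pr_tree, gt_tree) == 0 else 0.0
    return max(0.0, 1.0 - ted(pr_tree, gt_tree) / baseline)

# Hypothetical label metadata, for illustration only.
gt = {"genus": "Quercus", "species": "ilex"}
pred = {"genus": "Quercus", "species": "robur"}
print(ted_accuracy(pred, gt))  # one wrong leaf out of four non-root nodes -> 0.75
```

With unit costs, a perfect prediction scores 1.0 and an empty prediction scores 0.0. A production evaluation would more likely weight leaf relabelling by string similarity, so that a near-miss transcription is penalised less than a completely wrong one.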