HeR-T: Herbarium specimen label Recognition Transformer

📃 Paper

Application of computer vision to the automated extraction of metadata from natural history specimen labels: A case study on herbarium specimens (Under Review)

Authors

Zacchigna, Jacopo; Liu, Weiwei; Pellegrino, Felice Andrea; Peron, Adriano; Roma-Marzio, Francesco; Peruzzi, Lorenzo; Martellos, Stefano

🚀 Overview

HeR-T (Herbarium specimen label Recognition Transformer) is a fine-tuned vision-language model for automated metadata extraction from natural history specimen labels, in particular herbarium specimen labels. It is built on Donut-base and was fine-tuned on 55,089 herbarium specimen images from the Herbarium of the University of Pisa (international acronym PI).

🔥 Features

  • Fine-tuned on specimen images from the Herbarium of the University of Pisa for automated metadata extraction from natural history specimen labels
  • Supports image inputs whose labels contain printed, handwritten, or mixed-format text
  • Evaluation: Tree Edit Distance (TED) accuracy score, computed as max(0, 1 − TED(pr, gt) / TED(φ, gt)), where gt, pr, and φ denote the ground-truth, predicted, and empty trees, respectively
  • Pre-trained weights are loaded from Donut-base (naver-clova-ix/donut-base)
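
The TED accuracy score listed above can be illustrated with a small, self-contained sketch. It is not the evaluation code used for HeR-T: it assumes unit costs for inserting, deleting, and relabelling nodes, a simple dict-to-tree conversion with sorted keys, and made-up field names (`scientific_name`, `collector`). The helpers `to_tree`, `ted`, and `ted_accuracy` are hypothetical names introduced only for this example.

```python
from functools import lru_cache

# A tree is (label, children), where children is a tuple of trees.

def to_tree(obj, label="<root>"):
    """Convert nested dict/list/scalar data into a labelled ordered tree.
    Assumption: dict keys are sorted so that key order does not affect the score."""
    if isinstance(obj, dict):
        kids = tuple(to_tree(v, k) for k, v in sorted(obj.items()))
    elif isinstance(obj, list):
        kids = tuple(to_tree(v, "<item>") for v in obj)
    else:
        kids = ((str(obj), ()),)  # scalar value becomes a leaf node
    return (label, kids)

def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])

@lru_cache(maxsize=None)
def forest_dist(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1:
        return sum(size(t) for t in f2)  # insert everything in f2
    if not f2:
        return sum(size(t) for t in f1)  # delete everything in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        forest_dist(f1[:-1] + kids1, f2) + 1,  # delete rightmost root of f1
        forest_dist(f1, f2[:-1] + kids2) + 1,  # insert rightmost root of f2
        forest_dist(f1[:-1], f2[:-1])          # match the two rightmost roots
        + forest_dist(kids1, kids2)
        + (0 if label1 == label2 else 1),      # relabel cost if labels differ
    )

def ted(tree_a, tree_b):
    """Tree edit distance between two trees."""
    return forest_dist((tree_a,), (tree_b,))

def ted_accuracy(pred, gt):
    """max(0, 1 - TED(pr, gt) / TED(phi, gt)), with phi the empty tree."""
    pr_tree, gt_tree, empty = to_tree(pred), to_tree(gt), to_tree({})
    denom = ted(empty, gt_tree)
    if denom == 0:  # ground truth is itself empty
        return 1.0 if ted(pr_tree, gt_tree) == 0 else 0.0
    return max(0.0, 1 - ted(pr_tree, gt_tree) / denom)

# Hypothetical label metadata; the prediction has one misspelled value.
gt = {"scientific_name": "Bellis perennis", "collector": "L. Peruzzi"}
pred = {"scientific_name": "Bellis perennis", "collector": "L. Peruzi"}
print(ted_accuracy(pred, gt))  # one relabel out of four non-root nodes -> 0.75
```

With unit costs, a single wrong field value counts as one relabel regardless of how close the strings are. An evaluator that weights relabelling by string similarity would give partial credit for near-misses.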