
Model Card for Flan-T5 Base Token Classifier (NER: LOC, ORG, PER)

This model is a fine-tuned encoder-only version of google/flan-t5-base for token-level Named Entity Recognition (NER). It predicts entity labels (e.g., LOC, ORG, PER) by classifying individual tokens in a prompting setup using <TSTART> and <TEND> markers.


Model Details

Model Description

This model is based on the encoder of the T5 architecture and has been fine-tuned for single-token classification using a prompt-driven approach. Given a sentence, one token is wrapped with <TSTART> and <TEND>, and the model predicts the corresponding entity class.
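
For illustration, a single request pairs a prompt containing one wrapped token with a single label (the prompt prefix is the one used in the usage example below; the expected label is inferred for illustration):

Input:  classify token in: The headquarters of <TSTART> Microsoft <TEND> is in Redmond.
Expected label: ORG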

  • Developed by: pepegiallo
  • Model type: Encoder-only token classifier
  • Language(s) (NLP): en, de, fr, it, es
  • License: MIT
  • Finetuned from model: google/flan-t5-base

Uses

Direct Use

You can use this model to classify one token at a time as PER, ORG, LOC, or O (no entity). This approach is suitable for tasks such as:

  • PII detection
  • Privacy-preserving document redaction
  • Legal or medical text anonymization

Out-of-Scope Use

  • Full-sequence tagging (this model is optimized for classifying one token at a time)
  • Multi-token entity recognition without aggregation logic (a simple aggregation sketch follows the usage example below)

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model = AutoModelForSequenceClassification.from_pretrained("pepegiallo/flan-t5-base_ner")
tokenizer = AutoTokenizer.from_pretrained("pepegiallo/flan-t5-base_ner")
model.eval()

# Helper: wrap the first occurrence of a token with <TSTART> and <TEND>
def wrap_token(text, target_token, tstart="<TSTART>", tend="<TEND>"):
    return text.replace(target_token, f"{tstart} {target_token} {tend}", 1)

text = "The headquarters of Microsoft is in Redmond."
target_token = "Microsoft"
prompt = "classify token in: " + wrap_token(text, target_token)

inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
label_id = torch.argmax(outputs.logits, dim=-1).item()

id2label = {0: "LOC", 1: "ORG", 2: "PER", 3: "O"}
print("Predicted entity:", id2label[label_id])

Training Details

The model was fine-tuned for 3 epochs on a multilingual, balanced dataset combining the following sources (a data-preparation sketch follows the list):

  • wikiann (unimelb-nlp)
  • open-pii-masking-500k-ai4privacy
  • Custom annotated examples with <TSTART> / <TEND> tags
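
The exact preprocessing code is not published with the card. A minimal sketch of how IOB-tagged data such as wikiann could be converted into one (prompt, label) pair per token, assuming the B-/I- prefixes are stripped down to the three entity classes:

def iob_to_examples(tokens, iob_tags):
    """Turn one IOB-tagged sentence into one (prompt, label) pair per token (illustrative)."""
    examples = []
    for i, tag in enumerate(iob_tags):
        label = tag.split("-")[-1]  # "B-ORG" / "I-ORG" -> "ORG", "O" stays "O"
        wrapped = " ".join(
            f"<TSTART> {t} <TEND>" if j == i else t for j, t in enumerate(tokens)
        )
        examples.append(("classify token in: " + wrapped, label))
    return examples

# wikiann-style example sentence
print(iob_to_examples(["Microsoft", "is", "in", "Redmond"], ["B-ORG", "O", "O", "B-LOC"]))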

Training Hyperparameters

  • Model: google/flan-t5-base (encoder only)
  • Batch size: 128
  • Max input length: 128
  • Optimizer: AdamW
  • Learning rate: 3e-5
  • Epochs: 3
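
The training script itself is not included; the snippet below is only a sketch of how these hyperparameters map onto Hugging Face TrainingArguments (the output path and the train/eval datasets are hypothetical placeholders):

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="flan-t5-ner-checkpoints",  # hypothetical output path
    per_device_train_batch_size=128,
    learning_rate=3e-5,                    # Trainer uses AdamW by default
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,                  # the classifier loaded in the usage example
    args=training_args,
    train_dataset=train_dataset,  # hypothetical tokenized (prompt, label) datasets
    eval_dataset=eval_dataset,
)
# trainer.train()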

Evaluation

Metrics

The model was evaluated using the following metrics (a short computation sketch follows the list):

  • Accuracy
  • Precision (Macro)
  • Recall (Macro)
  • F1 Score (Macro)
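
The evaluation code is not published either; a minimal sketch of how such macro-averaged scores can be computed from per-token predictions with scikit-learn (the label lists are hypothetical):

from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["ORG", "O", "LOC", "PER"]  # hypothetical gold labels
y_pred = ["ORG", "O", "LOC", "O"]    # hypothetical predictions

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.4f}  precision={precision:.4f}  recall={recall:.4f}  f1={f1:.4f}")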

Results

Epoch | Training Loss | Validation Loss | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro)
1     | 0.1702        | 0.1504          | 95.21%   | 0.9521            | 0.9521         | 0.9521
2     | 0.1444        | 0.1310          | 95.89%   | 0.9588            | 0.9589         | 0.9589
3     | 0.1290        | 0.1246          | 96.14%   | 0.9614            | 0.9614         | 0.9614

Environmental Impact

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]

Technical Specifications

  • Architecture: T5 Encoder + Dense Classification Head
  • Precision: fp32
  • Framework: PyTorch + Huggingface Transformers
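
The classification head is not described in detail; the sketch below is one possible reconstruction of a "T5 encoder + dense head" classifier (the class name, mean pooling over non-padding positions, and the four-label head size are assumptions):

import torch
import torch.nn as nn
from transformers import T5EncoderModel

class T5EncoderClassifier(nn.Module):
    """Hypothetical reconstruction: Flan-T5 encoder followed by a dense classification head."""
    def __init__(self, base_model="google/flan-t5-base", num_labels=4):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(base_model)
        self.classifier = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        if attention_mask is not None:
            # Mean-pool over non-padding positions (pooling strategy is an assumption)
            mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        else:
            pooled = hidden.mean(dim=1)
        return self.classifier(pooled)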

Citation

@misc{flan-t5-ner,
  title={Token Classification with Flan-T5 Encoder},
  author={pepegiallo},
  year={2025},
  howpublished={\url{https://huggingface.co/pepegiallo/flan-t5-base_ner}}
}

Model Card Contact

For questions, contact: https://huggingface.co/pepegiallo
