Model Card for Flan-T5 Base Token Classifier (NER: LOC, ORG, PER)
This model is a fine-tuned, encoder-only version of google/flan-t5-base for token-level Named Entity Recognition (NER). It predicts entity labels (e.g., LOC, ORG, PER) by classifying individual tokens in a prompting setup that uses `<TSTART>` and `<TEND>` markers.
Model Details
Model Description
This model is based on the encoder of the T5 architecture and has been fine-tuned for single-token classification using a prompt-driven approach. Given a sentence, one token is wrapped with `<TSTART>` and `<TEND>`, and the model predicts the corresponding entity class. For example, given the input "The headquarters of `<TSTART>` Microsoft `<TEND>` is in Redmond.", the model should predict ORG.
- Developed by: pepegiallo
- Model type: Encoder-only token classifier
- Language(s) (NLP): en, de, fr, it, es
- License: MIT
- Finetuned from model: google/flan-t5-base
Uses
Direct Use
You can use this model to classify named entities (PER, ORG, LOC, or O) one token at a time. This approach is suitable for tasks such as:
- PII detection
- Privacy-preserving document redaction
- Legal or medical text anonymization
Out-of-Scope Use
- Full-sequence tagging (this model is optimized for classifying one token at a time)
- Multi-token entity recognition without aggregation logic
How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model = AutoModelForSequenceClassification.from_pretrained("pepegiallo/flan-t5-base_ner")
tokenizer = AutoTokenizer.from_pretrained("pepegiallo/flan-t5-base_ner")
model.eval()

# Helper: wrap the target token with <TSTART> and <TEND> markers
def wrap_token(text, target_token, tstart="<TSTART>", tend="<TEND>"):
    return text.replace(target_token, f"{tstart} {target_token} {tend}")

text = "The headquarters of Microsoft is in Redmond."
target_token = "Microsoft"
prompt = "classify token in: " + wrap_token(text, target_token)

# Tokenize the prompt and run the classifier
inputs = tokenizer(prompt, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)

# Map the predicted class id to its entity label
label_id = torch.argmax(outputs.logits, dim=-1).item()
id2label = {0: "LOC", 1: "ORG", 2: "PER", 3: "O"}
print("Predicted entity:", id2label[label_id])
```
Training Details
The model was fine-tuned for 3 epochs on a multilingual, balanced dataset combining:
- wikiann (unimelb-nlp)
- open-pii-masking-500k-ai4privacy
- Custom annotated examples with `<TSTART>`/`<TEND>` tags
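The exact preprocessing used for this model is not published. As a hedged sketch, per-token training prompts could be derived from a token-classification dataset such as wikiann like this; the column names, label ordering, and IOB2-to-coarse-label mapping below are assumptions based on the wikiann schema.

```python
from datasets import load_dataset

# wikiann stores tokens plus integer ner_tags; the usual label order is assumed here
dataset = load_dataset("unimelb-nlp/wikiann", "en", split="train")
iob2label = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

def to_prompts(example):
    """Turn one wikiann example into one (prompt, label) pair per token."""
    tokens, tags = example["tokens"], example["ner_tags"]
    prompts, labels = [], []
    for i, (token, tag) in enumerate(zip(tokens, tags)):
        wrapped = tokens[:i] + ["<TSTART>", token, "<TEND>"] + tokens[i + 1:]
        prompts.append("classify token in: " + " ".join(wrapped))
        # Collapse IOB2 tags to the coarse classes used by this model: "B-ORG" -> "ORG", "O" -> "O"
        labels.append(iob2label[tag].split("-")[-1])
    return {"prompt": prompts, "label": labels}

print(to_prompts(dataset[0])["prompt"][:2])
```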
Training Hyperparameters
- Model: google/flan-t5-base (encoder only)
- Batch size: 128
- Max input length: 128
- Optimizer: AdamW
- Learning rate: 3e-5
- Epochs: 3
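For reproduction, the hyperparameters above roughly map onto a Hugging Face `TrainingArguments` configuration. The actual training script is not published, so the sketch below is only an assumed translation of the listed values; any setting not named above (warmup, weight decay, evaluation cadence) is left at its default.

```python
from transformers import TrainingArguments

# Assumed mapping of the documented hyperparameters onto Trainer-style settings
training_args = TrainingArguments(
    output_dir="flan-t5-base_ner",
    num_train_epochs=3,
    per_device_train_batch_size=128,
    learning_rate=3e-5,
    optim="adamw_torch",
)
```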
Evaluation
Metrics
The model was evaluated using the following metrics:
- Accuracy
- Precision (Macro)
- Recall (Macro)
- F1 Score (Macro)
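A minimal sketch of how these metrics can be computed with scikit-learn, assuming `y_true` and `y_pred` are lists of string labels; this mirrors the macro averaging reported below but is not the exact evaluation script.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels; in practice these come from running the model on a validation set
y_true = ["ORG", "LOC", "O", "PER"]
y_pred = ["ORG", "LOC", "O", "O"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```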
Results
| Epoch | Training Loss | Validation Loss | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) |
|---|---|---|---|---|---|---|
| 1 | 0.1702 | 0.1504 | 95.21% | 0.9521 | 0.9521 | 0.9521 |
| 2 | 0.1444 | 0.1310 | 95.89% | 0.9588 | 0.9589 | 0.9589 |
| 3 | 0.1290 | 0.1246 | 96.14% | 0.9614 | 0.9614 | 0.9614 |
Environmental Impact
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
Technical Specifications
- Architecture: T5 Encoder + Dense Classification Head
- Precision: fp32
- Framework: PyTorch + Huggingface Transformers
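Conceptually, the classifier is a T5 encoder followed by a dense classification head. The pooling strategy is not documented in this card, so the sketch below assumes mean pooling over non-padding encoder states; it illustrates the overall architecture only and is not the exact released implementation.

```python
import torch
import torch.nn as nn
from transformers import T5EncoderModel

class T5EncoderClassifier(nn.Module):
    """Illustrative encoder-only classifier: T5 encoder + dense head over pooled states."""
    def __init__(self, base_model="google/flan-t5-base", num_labels=4):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(base_model)
        self.classifier = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Mean-pool over non-padding positions (assumption; the card does not specify the pooling)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)
```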
Citation
```bibtex
@misc{flan-t5-ner,
  title={Token Classification with Flan-T5 Encoder},
  author={pepegiallo},
  year={2025},
  howpublished={\url{https://huggingface.co/pepegiallo/flan-t5-base_ner}}
}
```
Model Card Contact
For questions, contact: https://huggingface.co/pepegiallo