Model Card for Flan-T5 Base Token Classifier (NER: LOC, ORG, PER)
This model is a fine-tuned, encoder-only version of google/flan-t5-base for token-level Named Entity Recognition (NER). It predicts entity labels (e.g., LOC, ORG, PER) by classifying individual tokens in a prompting setup that uses `<TSTART>` and `<TEND>` markers.
Model Details
Model Description
This model is based on the encoder of the T5 architecture and has been fine-tuned for single-token classification using a prompt-driven approach. Given a sentence, one token is wrapped with `<TSTART>` and `<TEND>`, and the model predicts the corresponding entity class. For example, given the input "The headquarters of `<TSTART>` Microsoft `<TEND>` is in Redmond.", the model should predict ORG.
- Developed by: pepegiallo
- Model type: Encoder-only token classifier
- Language(s) (NLP): en, de, fr, it, es
- License: MIT
- Finetuned from model: google/flan-t5-base
Uses
Direct Use
You can use this model to classify named entities (PER, ORG, LOC, or O) one token at a time. This approach is suitable for tasks such as:
- PII detection
- Privacy-preserving document redaction
- Legal or medical text anonymization
Out-of-Scope Use
- Full-sequence tagging (this model is optimized for classifying one token at a time)
- Multi-token entity recognition without aggregation logic
How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
model = AutoModelForSequenceClassification.from_pretrained("pepegiallo/flan-t5-base_ner")
tokenizer = AutoTokenizer.from_pretrained("pepegiallo/flan-t5-base_ner")
model.eval()

# Helper: wrap the target token with <TSTART> and <TEND> markers
def wrap_token(text, target_token, tstart="<TSTART>", tend="<TEND>"):
    return text.replace(target_token, f"{tstart} {target_token} {tend}")

text = "The headquarters of Microsoft is in Redmond."
target_token = "Microsoft"
prompt = "classify token in: " + wrap_token(text, target_token)

# Tokenize the prompt and run the classifier
inputs = tokenizer(prompt, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)

# Map the predicted class id to its entity label
label_id = torch.argmax(outputs.logits, dim=-1).item()
id2label = {0: "LOC", 1: "ORG", 2: "PER", 3: "O"}
print("Predicted entity:", id2label[label_id])
```
Training Details
The model was fine-tuned for 3 epochs on a multilingual, balanced dataset combining:
- wikiann (unimelb-nlp)
- open-pii-masking-500k-ai4privacy
- Custom annotated examples with `<TSTART>`/`<TEND>` tags
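The exact preprocessing used for this model is not published. As a hedged sketch, per-token training prompts could be derived from a token-classification dataset such as wikiann like this; the column names, label ordering, and IOB2-to-coarse-label mapping below are assumptions based on the wikiann schema.

```python
from datasets import load_dataset

# wikiann stores tokens plus integer ner_tags; the usual label order is assumed here
dataset = load_dataset("unimelb-nlp/wikiann", "en", split="train")
iob2label = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

def to_prompts(example):
    """Turn one wikiann example into one (prompt, label) pair per token."""
    tokens, tags = example["tokens"], example["ner_tags"]
    prompts, labels = [], []
    for i, (token, tag) in enumerate(zip(tokens, tags)):
        wrapped = tokens[:i] + ["<TSTART>", token, "<TEND>"] + tokens[i + 1:]
        prompts.append("classify token in: " + " ".join(wrapped))
        # Collapse IOB2 tags to the coarse classes used by this model: "B-ORG" -> "ORG", "O" -> "O"
        labels.append(iob2label[tag].split("-")[-1])
    return {"prompt": prompts, "label": labels}

print(to_prompts(dataset[0])["prompt"][:2])
```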
Training Hyperparameters
- Model: google/flan-t5-base (encoder only)
- Batch size: 128
- Max input length: 128
- Optimizer: AdamW
- Learning rate: 3e-5
- Epochs: 3
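For reproduction, the hyperparameters above roughly map onto a Hugging Face `TrainingArguments` configuration. The actual training script is not published, so the sketch below is only an assumed translation of the listed values; any setting not named above (warmup, weight decay, evaluation cadence) is left at its default.

```python
from transformers import TrainingArguments

# Assumed mapping of the documented hyperparameters onto Trainer-style settings
training_args = TrainingArguments(
    output_dir="flan-t5-base_ner",
    num_train_epochs=3,
    per_device_train_batch_size=128,
    learning_rate=3e-5,
    optim="adamw_torch",
)
```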
Evaluation
Metrics
The model was evaluated using the following metrics:
- Accuracy
- Precision (Macro)
- Recall (Macro)
- F1 Score (Macro)
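A minimal sketch of how these metrics can be computed with scikit-learn, assuming `y_true` and `y_pred` are lists of string labels; this mirrors the macro averaging reported below but is not the exact evaluation script.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder labels; in practice these come from running the model on a validation set
y_true = ["ORG", "LOC", "O", "PER"]
y_pred = ["ORG", "LOC", "O", "O"]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```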
Results
| Epoch | Training Loss | Validation Loss | Accuracy | Precision (Macro) | Recall (Macro) | F1 (Macro) |
|---|---|---|---|---|---|---|
| 1 | 0.1702 | 0.1504 | 95.21% | 0.9521 | 0.9521 | 0.9521 |
| 2 | 0.1444 | 0.1310 | 95.89% | 0.9588 | 0.9589 | 0.9589 |
| 3 | 0.1290 | 0.1246 | 96.14% | 0.9614 | 0.9614 | 0.9614 |
Environmental Impact
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
Technical Specifications
- Architecture: T5 Encoder + Dense Classification Head
- Precision: fp32
- Framework: PyTorch + Huggingface Transformers
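Conceptually, the classifier is a T5 encoder followed by a dense classification head. The pooling strategy is not documented in this card, so the sketch below assumes mean pooling over non-padding encoder states; it illustrates the overall architecture only and is not the exact released implementation.

```python
import torch
import torch.nn as nn
from transformers import T5EncoderModel

class T5EncoderClassifier(nn.Module):
    """Illustrative encoder-only classifier: T5 encoder + dense head over pooled states."""
    def __init__(self, base_model="google/flan-t5-base", num_labels=4):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(base_model)
        self.classifier = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        # Mean-pool over non-padding positions (assumption; the card does not specify the pooling)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)
```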
Citation
```bibtex
@misc{flan-t5-ner,
  title={Token Classification with Flan-T5 Encoder},
  author={pepegiallo},
  year={2025},
  howpublished={\url{https://huggingface.co/pepegiallo/flan-t5-base_ner}}
}
```
Model Card Contact
For questions, contact: https://huggingface.co/pepegiallo