xlmr-large-classifier-pinocchio_it_tra1-eng - MT/HT Classifier

This model is a fine-tuned version of FacebookAI/xlm-roberta-large for distinguishing between Machine Translated (MT) and Human Translated (HT) text (or HT1 and HT2 if using two different human translators).

Training data:

  • Train: 1490 examples (745 per label)
  • Validation: 164 examples (82 per label)
  • Test: 214 examples (107 per label)

Results on the held-out test set:

  • Accuracy: 0.9065
  • F1-Score: 0.9099
  • Precision: 0.8783
  • Recall: 0.9439
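
The four figures above are mutually consistent. As a quick sanity check, the sketch below reconstructs a confusion matrix that reproduces them. It assumes that label 1 is treated as the positive class and that the metrics were computed over all 214 test examples; the counts are inferred from the reported numbers, not taken from the authors' evaluation output.

```python
# Sanity check of the reported test metrics.
# Assumption: label 1 is the positive class. The counts below are
# inferred from the published figures, not published themselves.
POS = 107  # test examples with label 1
NEG = 107  # test examples with label 0

tp = 101        # inferred: 101 / 107 rounds to the reported recall
fn = POS - tp   # positives the classifier missed
fp = 14         # inferred: 101 / (101 + 14) rounds to the reported precision
tn = NEG - fp   # negatives classified correctly

accuracy = (tp + tn) / (POS + NEG)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

metrics = {
    "accuracy": round(accuracy, 4),
    "f1": round(f1, 4),
    "precision": round(precision, 4),
    "recall": round(recall, 4),
}
print(metrics)
```

Under these assumptions the classifier would have made 20 errors on the test set: 14 false positives and 6 false negatives.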

Label mapping

Label MT: 0

Label PE: 1 (text produced by the human translator)
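
To make the mapping concrete, here is a minimal sketch that turns a pair of raw logits into a label name and a probability. The `ID2LABEL` dictionary simply restates the mapping above; the `softmax` and `decode` helpers and the example logits are hypothetical and only serve as illustration. Whether the checkpoint's own configuration carries these label names is not stated in this card, so the mapping is hard-coded here.

```python
import math

# Mapping restated from this model card: 0 = MT, 1 = PE (human translator).
ID2LABEL = {0: "MT", 1: "PE"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


# Hypothetical logits, for illustration only (not real model output).
label, prob = decode([-1.2, 2.3])
print(label, round(prob, 3))
```

With real model output, the list passed to `decode` would be one row of the logits tensor converted to a Python list.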

Info

Upload date: 2025-04-30 00:00

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "DanielSc4/xlmr-large-classifier-pinocchio_it_tra1-eng"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Run on GPU when available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()  # eval mode disables dropout

inp = tokenizer("This is a test", return_tensors="pt").to(device)

with torch.no_grad():  # gradients are not needed for inference
    out = model(**inp)

logits = out.logits
probs = logits.softmax(dim=-1)
pred = probs.argmax(dim=-1).item()
print("Predicted class: " + str(pred))  # 0 for MT, 1 for PE