BERT Fine-tuned for Sentiment Analysis on SST-2

Model Description

This model is a fine-tuned version of bert-base-uncased on the Stanford Sentiment Treebank v2 (SST-2) dataset. It was trained to perform binary sentiment classification (positive/negative) on movie review sentences.

Intended Uses & Limitations

Intended Uses

  • Sentiment analysis of short English texts, particularly movie reviews and similar content
  • Educational purposes for demonstrating fine-tuning of pre-trained language models
  • Baseline for comparison with more advanced sentiment analysis approaches

Limitations

  • The model is trained on movie reviews and may not generalize well to other domains (e.g., product reviews, social media posts)
  • Limited to English language text
  • Not optimized for very long texts: inputs beyond BERT's 512-token limit are truncated (best suited to sentences or short paragraphs; see the sketch after this list)
  • Binary classification only (positive/negative) without nuanced sentiment scores
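
BERT-based checkpoints accept at most 512 tokens per input, so longer texts are truncated. Below is a minimal sketch of making that truncation explicit; the long input is illustrative, and the tokenizer is assumed to be the standard bert-base-uncased one shipped with this repository:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("radubutucelea23/bert_base_uncased_sst2")

# Illustrative stand-in for an input far longer than the model can read.
long_text = "A very long review. " * 300

# Without truncation=True, inputs past 512 tokens would overflow BERT's
# position embeddings; with it, everything past the limit is dropped.
encoded = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, 512])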

Training and Evaluation Data

The model was fine-tuned on the SST-2 dataset from the GLUE benchmark:

  • Training set: 67,349 examples
  • Validation set: 872 examples
  • Test set: Not used in this fine-tuning

The SST-2 dataset consists of sentences from movie reviews with their associated binary sentiment labels.

For more information about the dataset, see the GLUE benchmark dataset card.
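
A quick way to inspect the splits and the label mapping is the Hugging Face datasets library; a sketch, assuming the standard GLUE loader:

from datasets import load_dataset

# SST-2 from GLUE; the held-out test split exists but its labels are hidden (-1).
dataset = load_dataset("glue", "sst2")
print(dataset)                                   # train: 67,349 / validation: 872
print(dataset["train"][0])                       # {'sentence': ..., 'label': 0 or 1, 'idx': ...}
print(dataset["train"].features["label"].names)  # ['negative', 'positive']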

Training Procedure

Training Hyperparameters

  • Base model: bert-base-uncased
  • Epochs: 3
  • Hardware: NVIDIA A100 GPU

Training Throughput

  • Training samples per second: 187.16
  • Training steps per second: 23.396
  • Total training FLOPs: 3.08e+15
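
The exact training script is not published with this card. The following is a minimal sketch of an equivalent fine-tuning run using the Hugging Face Trainer; the epoch count matches the card, while the learning rate, batch size, and sequence length are illustrative assumptions:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# SST-2 from the GLUE benchmark: 67,349 train / 872 validation examples.
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Fixed-length padding keeps the default data collator happy.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)  # length assumed

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="bert_base_uncased_sst2",
    num_train_epochs=3,              # matches the card
    per_device_train_batch_size=32,  # assumed, not reported
    learning_rate=2e-5,              # assumed, not reported
)

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()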

Training Results

Epoch   Training Loss   Validation Loss   Accuracy
1       0.256300        0.427576          0.899083
2       0.169200        0.415616          0.903670
3       0.095600        0.426083          0.903670

Final training loss: 0.1982

Performance

Metrics

  • Accuracy on validation set: 90.37%
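
The validation accuracy can be re-checked with the evaluate library; a sketch, assuming the checkpoint published on the Hub:

import evaluate
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "radubutucelea23/bert_base_uncased_sst2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

val = load_dataset("glue", "sst2", split="validation")
metric = evaluate.load("accuracy")

# Score the 872 validation sentences in small batches.
for start in range(0, len(val), 32):
    batch = val[start:start + 32]
    inputs = tokenizer(batch["sentence"], truncation=True,
                       padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    metric.add_batch(predictions=logits.argmax(-1).tolist(),
                     references=batch["label"])

print(metric.compute())  # expected ≈ {'accuracy': 0.9037}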

Model Limitations and Biases

This model may have inherited biases from its training data and pre-training corpus:

  • The SST-2 dataset consists primarily of movie reviews, which may not represent diverse perspectives
  • The model may perform differently across demographic groups and cultural contexts
  • It may have difficulty with sarcasm, irony, or culturally specific expressions

Usage

from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
sentiment_analyzer = pipeline(
    "sentiment-analysis", model="radubutucelea23/bert_base_uncased_sst2")

texts = [
    "I really enjoyed this movie, the acting was superb.",
    "The plot was confusing and the characters were poorly developed.",
]

# Each result is a dict containing the predicted label and its score.
results = sentiment_analyzer(texts)
print(results)
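
For lower-level control (e.g., access to raw logits and class probabilities), the checkpoint can also be used without the pipeline. A minimal sketch:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "radubutucelea23/bert_base_uncased_sst2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("I really enjoyed this movie.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
pred = probs.argmax(-1).item()
# Label names come from the checkpoint config (e.g. LABEL_0/LABEL_1 if unset).
print(model.config.id2label[pred], probs[0, pred].item())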

Citation

If you use this model, please cite:

@inproceedings{socher-etal-2013-recursive,
    title = "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank",
    author = "Socher, Richard and Perelygin, Alex and Wu, Jean and Chuang, Jason and Manning, Christopher D. and Ng, Andrew Y. and Potts, Christopher",
    booktitle = "Proceedings of EMNLP",
    year = "2013"
}

@article{devlin2019bert,
    title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
    author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
    journal={arXiv preprint arXiv:1810.04805},
    year={2018}
}

Further Information

  • Model Type: Text Classification
  • Model Size: 109M parameters (F32, Safetensors)
  • Language: English
  • License: MIT
  • Developer: Radu Butucelea
  • Last Updated: April 2, 2025

For questions and feedback, please contact me through my Hugging Face profile: radubutucelea23
