BERT Fine-tuned for Sentiment Analysis on SST-2

Model Description

This model is a fine-tuned version of bert-base-uncased on the Stanford Sentiment Treebank v2 (SST-2) dataset. It was trained to perform binary sentiment classification (positive/negative) on movie review sentences.

Intended Uses & Limitations

Intended Uses

  • Sentiment analysis of short English texts, particularly movie reviews and similar content
  • Educational purposes for demonstrating fine-tuning of pre-trained language models
  • Baseline for comparison with more advanced sentiment analysis approaches

Limitations

  • The model is trained on movie reviews and may not generalize well to other domains (e.g., product reviews, social media posts)
  • Limited to English language text
  • Not optimized for very long texts: inputs beyond BERT's 512-token limit are truncated (best suited to sentences or short paragraphs; see the sketch after this list)
  • Binary classification only (positive/negative) without nuanced sentiment scores
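
BERT-based checkpoints accept at most 512 tokens per input, so longer texts are truncated. Below is a minimal sketch of making that truncation explicit; the long input is illustrative, and the tokenizer is assumed to be the standard bert-base-uncased one shipped with this repository:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("radubutucelea23/bert_base_uncased_sst2")

# Illustrative stand-in for an input far longer than the model can read.
long_text = "A very long review. " * 300

# Without truncation=True, inputs past 512 tokens would overflow BERT's
# position embeddings; with it, everything past the limit is dropped.
encoded = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
print(encoded["input_ids"].shape)  # torch.Size([1, 512])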

Training and Evaluation Data

The model was fine-tuned on the SST-2 dataset from the GLUE benchmark:

  • Training set: 67,349 examples
  • Validation set: 872 examples
  • Test set: Not used in this fine-tuning

The SST-2 dataset consists of sentences from movie reviews with their associated binary sentiment labels.

For more information about the dataset, see the GLUE benchmark dataset card.
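
A quick way to inspect the splits and the label mapping is the Hugging Face datasets library; a sketch, assuming the standard GLUE loader:

from datasets import load_dataset

# SST-2 from GLUE; the held-out test split exists but its labels are hidden (-1).
dataset = load_dataset("glue", "sst2")
print(dataset)                                   # train: 67,349 / validation: 872
print(dataset["train"][0])                       # {'sentence': ..., 'label': 0 or 1, 'idx': ...}
print(dataset["train"].features["label"].names)  # ['negative', 'positive']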

Training Procedure

Training Hyperparameters

  • Base model: bert-base-uncased
  • Epochs: 3
  • Hardware: NVIDIA A100 GPU

Training Throughput

  • Training samples per second: 187.16
  • Training steps per second: 23.396
  • Total training FLOPs: 3.08e+15
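
The exact training script is not published with this card. The following is a minimal sketch of an equivalent fine-tuning run using the Hugging Face Trainer; the epoch count matches the card, while the learning rate, batch size, and sequence length are illustrative assumptions:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# SST-2 from the GLUE benchmark: 67,349 train / 872 validation examples.
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Fixed-length padding keeps the default data collator happy.
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)  # length assumed

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(
    output_dir="bert_base_uncased_sst2",
    num_train_epochs=3,              # matches the card
    per_device_train_batch_size=32,  # assumed, not reported
    learning_rate=2e-5,              # assumed, not reported
)

trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()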

Training Results

Epoch   Training Loss   Validation Loss   Accuracy
1       0.256300        0.427576          0.899083
2       0.169200        0.415616          0.903670
3       0.095600        0.426083          0.903670

Final training loss: 0.1982

Performance

Metrics

  • Accuracy on validation set: 90.37%
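
The validation accuracy can be re-checked with the evaluate library; a sketch, assuming the checkpoint published on the Hub:

import evaluate
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "radubutucelea23/bert_base_uncased_sst2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

val = load_dataset("glue", "sst2", split="validation")
metric = evaluate.load("accuracy")

# Score the 872 validation sentences in small batches.
for start in range(0, len(val), 32):
    batch = val[start:start + 32]
    inputs = tokenizer(batch["sentence"], truncation=True,
                       padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    metric.add_batch(predictions=logits.argmax(-1).tolist(),
                     references=batch["label"])

print(metric.compute())  # expected ≈ {'accuracy': 0.9037}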

Model Limitations and Biases

This model may have inherited biases from its training data and pre-training corpus:

  • The SST-2 dataset consists primarily of movie reviews, which may not represent diverse perspectives
  • The model may perform differently across demographic groups and cultural contexts
  • It may have difficulty with sarcasm, irony, or culturally specific expressions

Usage

from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
sentiment_analyzer = pipeline(
    "sentiment-analysis", model="radubutucelea23/bert_base_uncased_sst2")

texts = [
    "I really enjoyed this movie, the acting was superb.",
    "The plot was confusing and the characters were poorly developed.",
]

# Each result is a dict containing the predicted label and its score.
results = sentiment_analyzer(texts)
print(results)
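
For lower-level control (e.g., access to raw logits and class probabilities), the checkpoint can also be used without the pipeline. A minimal sketch:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "radubutucelea23/bert_base_uncased_sst2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("I really enjoyed this movie.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
pred = probs.argmax(-1).item()
# Label names come from the checkpoint config (e.g. LABEL_0/LABEL_1 if unset).
print(model.config.id2label[pred], probs[0, pred].item())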

Citation

If you use this model, please cite:

@inproceedings{socher-etal-2013-recursive,
    title = "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank",
    author = "Socher, Richard and Perelygin, Alex and Wu, Jean and Chuang, Jason and Manning, Christopher D. and Ng, Andrew Y. and Potts, Christopher",
    booktitle = "Proceedings of EMNLP",
    year = "2013"
}

@article{devlin2019bert,
    title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
    author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
    journal={arXiv preprint arXiv:1810.04805},
    year={2018}
}

Further Information

  • Model Type: Text Classification
  • Model Size: 109M parameters (F32, Safetensors)
  • Language: English
  • License: MIT
  • Developer: Radu Butucelea
  • Last Updated: April 2, 2025

For questions and feedback, please contact me through my Hugging Face profile: radubutucelea23
