Saty01 committed on
Commit 6a9ed07 · verified · 1 Parent(s): ac0c596

Update README.md

Files changed (1): README.md +94 -1
README.md CHANGED
@@ -1,6 +1,99 @@
- # FactChecker: Fake News Detection Web Application ![FactChecker Logo](build/logo.png)
+ ---
+ title: FactChecker
+ emoji: 📚
+ colorFrom: pink
+ colorTo: red
+ sdk: docker
+ pinned: false
+ license: mit
+ short_description: 'FactChecker: Fake News Detector'
+ ---
+
+ # <img src="build/logo.png" alt="FactChecker Logo" width="30" height="30"> FactChecker: Fake News Detection Web Application
+

  FactChecker is a web application that detects fake news using various machine learning models.
  The system analyzes text input and predicts whether the content is likely to be real or fake news,
  providing confidence scores and visualizations to help users understand the results.

+ ## Features
+
+ - **Multiple ML Models**: Choose from three different models or use all of them together:
+   - Logistic Regression (Accuracy: 90.42%, F1 Score: 87.62%)
+   - Random Forest (Accuracy: 90.83%, F1 Score: 87.52%)
+   - DistilBERT (Accuracy: 91.00%, F1 Score: 88.45%)
+ - **Ensemble Approach**: When "All Models" is selected, the system combines the individual predictions through a voting mechanism for more robust results (see the sketch after this list)
+ - **Real-time Analysis**: Instantly assess the credibility of news articles or statements
+ - **Confidence Scores**: View the model's level of certainty in its predictions
+ - **Visual Interface**: Color-coded results (green for real, red for fake) for intuitive understanding
+
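+ A minimal sketch of the "All Models" voting step, assuming each model returns a label and a confidence score (function and variable names here are illustrative, not the actual `app.py` code):
+
+ ```python
+ # Majority vote over the three model outputs.
+ # Each entry is (predicted_label, confidence) with 1 = real and 0 = fake (an assumed convention).
+ def ensemble_vote(predictions):
+     real_votes = sum(1 for label, _ in predictions if label == 1)
+     label = 1 if real_votes > len(predictions) - real_votes else 0
+     # Report the average confidence of the models that voted for the winning label.
+     winning = [conf for lab, conf in predictions if lab == label]
+     return label, sum(winning) / len(winning)
+
+ # Example: Logistic Regression and DistilBERT say "real", Random Forest says "fake".
+ print(ensemble_vote([(1, 0.91), (0, 0.78), (1, 0.88)]))  # -> (1, 0.895)
+ ```
+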
+ ## Technology Stack
+
+ ### Backend
+ - Python 3.11 with Flask 2.0.1
+ - NLTK 3.9.1 for natural language processing
+ - Scikit-learn 1.6.1 for traditional machine learning models
+ - PyTorch 2.6.0 and Transformers 4.49.0 for the DistilBERT model
+ - Gunicorn 20.1.0 for production deployment
+
+ **Verify these versions before running the backend**, for example with the quick check below.
+
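+ A convenience snippet (not part of the repository) for confirming the pinned versions from inside the active environment:
+
+ ```python
+ # Print the installed versions of the backend dependencies listed above.
+ import flask, gunicorn, nltk, sklearn, torch, transformers
+
+ for pkg in (flask, gunicorn, nltk, sklearn, torch, transformers):
+     print(pkg.__name__, pkg.__version__)
+ ```
+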
+ ### Frontend
+ - React.js for the user interface
+ - Modern JavaScript (ES6+)
+ - CSS for styling
+
+ ### Data Processing
+ - Pandas and NumPy for data manipulation
+ - TF-IDF Vectorization for feature extraction (see the sketch below)
+ - Regular expressions for text cleaning
+
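+ A rough illustration of this cleaning and feature-extraction path, using the pickled artifacts from the `models/` folder; the exact cleaning rules in `app.py` may differ:
+
+ ```python
+ import pickle
+ import re
+
+ # Minimal regex-based cleaning in the spirit of the steps listed above.
+ def clean_text(text):
+     text = text.lower()
+     text = re.sub(r"http\S+", " ", text)   # drop URLs
+     text = re.sub(r"[^a-z\s]", " ", text)  # keep letters only
+     return re.sub(r"\s+", " ", text).strip()
+
+ # Load the saved TF-IDF vectorizer and Logistic Regression model.
+ with open("models/tfidf_vectorizer.pkl", "rb") as f:
+     vectorizer = pickle.load(f)
+ with open("models/lr_model.pkl", "rb") as f:
+     lr_model = pickle.load(f)
+
+ features = vectorizer.transform([clean_text("Some news text to check...")])
+ print(lr_model.predict(features), lr_model.predict_proba(features))
+ ```
+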
+ ## Project Structure
+
+     FactChecker/
+     ├── build/                    # React build files (compiled frontend)
+     │   ├── static/
+     │   │   ├── css/              # Compiled CSS
+     │   │   └── js/               # Compiled JavaScript
+     │   ├── asset-manifest.json
+     │   ├── index.html            # Main HTML file
+     │   ├── logo.ico
+     │   ├── logo.png
+     │   └── manifest.json
+     ├── model_training/           # Model training materials
+     │   ├── visualizations/       # Generated visualization images
+     │   └── model_training.ipynb  # Jupyter notebook for model training
+     ├── models/                   # Saved ML models
+     │   ├── tfidf_vectorizer.pkl  # TF-IDF vectorizer
+     │   ├── lr_model.pkl          # Logistic Regression model
+     │   ├── rf_model.pkl          # Random Forest model
+     │   └── distilbert_model.pt   # DistilBERT model
+     ├── .gitattributes
+     ├── Dockerfile                # Docker configuration
+     ├── README.md
+     ├── app.py                    # Flask application (sketched below)
+     └── requirements.txt          # Python dependencies
+
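+ The `app.py` entry above is the Flask service that serves the built frontend and the prediction API. A minimal sketch of what such an endpoint could look like; the route name, JSON fields, and port are assumptions, not taken from the repository:
+
+ ```python
+ from flask import Flask, jsonify, request
+
+ app = Flask(__name__, static_folder="build", static_url_path="/")
+
+ def run_models(text):
+     # Placeholder: the real app would run the loaded ML models here.
+     return "real", 0.90
+
+ @app.route("/predict", methods=["POST"])
+ def predict():
+     text = (request.get_json(silent=True) or {}).get("text", "")
+     label, confidence = run_models(text)
+     return jsonify({"label": label, "confidence": confidence})
+
+ if __name__ == "__main__":
+     app.run(host="0.0.0.0", port=7860)  # port is an assumption (common for Docker Spaces)
+ ```
+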
+ ### Steps
+
+ #### For Backend:
+ 1. Clone the repository.
+ 2. Create a virtual environment and install the dependencies:
+    `pip install -r requirements.txt`
+ 3. Download the NLTK resources (used for preprocessing, as sketched after these steps):
+    `python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')"`
+ 4. Run the application:
+    `python app.py`
+
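+ The NLTK resources downloaded in step 3 suggest a preprocessing chain along these lines (a sketch under that assumption, not the exact code in `app.py`):
+
+ ```python
+ # Stopword removal and lemmatization with the downloaded NLTK resources.
+ from nltk.corpus import stopwords
+ from nltk.stem import WordNetLemmatizer
+
+ stop_words = set(stopwords.words("english"))
+ lemmatizer = WordNetLemmatizer()
+
+ def preprocess(text):
+     tokens = text.lower().split()
+     return " ".join(
+         lemmatizer.lemmatize(tok) for tok in tokens if tok.isalpha() and tok not in stop_words
+     )
+
+ print(preprocess("Scientists claim the moon is made of cheese"))
+ ```
+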
+ #### For Frontend:
+ 1. Install the dependencies:
+    `npm install`
+ 2. Build the frontend:
+    `npm run build`
+
+ #### Model Training
+ To retrain the models:
+ 1. Upload the notebook to Google Colab.
+ 2. Download the ISOT dataset (true.csv, fake.csv) and upload it to Google Drive (loading sketch below).
+ 3. Change the runtime type to a GPU instance (recommended).
+ 4. Connect to the runtime.
+ 5. Run all cells.
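+
+ The first training step loads the two ISOT CSV files; a minimal sketch of that step, with the paths adjusted to wherever the files live in Google Drive:
+
+ ```python
+ import pandas as pd
+
+ # Label the two halves of the ISOT dataset (1 = real, 0 = fake) and shuffle them together.
+ true_df = pd.read_csv("true.csv").assign(label=1)
+ fake_df = pd.read_csv("fake.csv").assign(label=0)
+ data = pd.concat([true_df, fake_df], ignore_index=True).sample(frac=1, random_state=42)
+
+ print(data.shape, data["label"].value_counts().to_dict())
+ ```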