File size: 4,890 Bytes
d82d43f
 
 
 
 
1199301
d82d43f
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1199301
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
---
title: News Summarizer
emoji: πŸ‘
colorFrom: gray
colorTo: green
sdk: docker
sdk_version: 5.22.0
app_file: api.py
pinned: false
short_description: An app for summarizing news articles on orgs.
---

# News Summarization and Text-to-Speech Application

## Overview
This project is a web-based application that extracts key details from multiple news articles related to a given company, performs sentiment analysis, conducts a comparative analysis, and generates a text-to-speech (TTS) output in Hindi.

## Features
- **News Extraction**: Scrapes and displays at least 10 news articles from The New York Times and BBC.
- **Sentiment Analysis**: Categorizes articles into Positive, Negative, or Neutral sentiments.
- **Comparative Analysis**: Groups articles with most semantic similarity. Then compares the groups to derive insights on how a company's news coverage varies.
- **Text-to-Speech (TTS)**: Converts the summarized sentiment report into Hindi speech.
- **User Interface**: Provides a simple web-based interface using Gradio.
- **API Integration**: Implements FastAPI for backend communication.
- **Deployment**: Deployable on Hugging Face Spaces.

## Tech Stack
- **Frontend**: Gradio
- **Backend**: FastAPI
- **Scraping**: BeautifulSoup
- **NLP**: OpenAI GPT models, LangChain, Sentence Transformers
- **Sentiment Analysis**: Pre-trained Transformer model
- **Text-to-Speech**: Google TTS (gTTS)
- **Deployment**: Uvicorn, Hugging Face Spaces

---

## Installation and Setup

### 1. Clone the Repository
```bash
git clone https://github.com/Senzen18/News-Summarizer.git
cd News-Summarizer
```

### 2. Install Dependencies
Ensure you have Python 3.8+ installed. Then, run:
```bash
pip install -r requirements.txt
```

### 3. To run Fast API endpoints
Start the FastAPI backend:
```bash
uvicorn api:app --host 127.0.0.1 --port 8000 --reload
```

### 4. To run the both Gradio and Fast API
Start the FastAPI backend:
```bash
gradio app.py
```

### 5. Access the Application
Once started, access the Gradio UI at:
```
http://127.0.0.1:7860
```

---

## API Endpoints

### 1. Fetch News
**GET** `/news/{company_name}`
- Fetches the latest articles related to a company.
- **Example:** `/news/Tesla`

### 2. Analyze News Sentiment
**GET** `/analyze-news`
- Performs sentiment analysis on the extracted articles.

### 3. Compare News Articles
**POST** `/compare-news`
- Performs comparative analysis.
- **Request Body:**
```json
{
  "api_key": "your-openai-api-key",
  "model_name": "gpt-4o-mini",
  "company_name": "Tesla"
}
```

### 4. Generate Hindi Summary 
**GET** `/hindi-summary`
- Returns the summarized analysis in Hindi and stores the speech file.

---

## File Structure
```
β”œβ”€β”€ api.py               # FastAPI backend for news extraction, sentiment analysis, and comparison
β”œβ”€β”€ app.py               # Gradio frontend to interact with users
β”œβ”€β”€ llm_utils.py         # Handles OpenAI API calls for topic extraction and comparative analysis
β”œβ”€β”€ utils.py             # Utility functions for web scraping, sentiment analysis, and TTS
β”œβ”€β”€ requirements.txt     # Dependencies
└── README.md            # Project documentation
```

---

## Assumptions and Limitations
- Only extracts articles from The New York Times and BBC.
- Requires a valid OpenAI API key for sentiment analysis and comparison.
- Hindi speech output uses gTTS, which requires an internet connection.

---

## Deployment
This project can be deployed on Hugging Face Spaces. To deploy:
1. Push your repository to GitHub.
2. Follow [Hugging Face Spaces documentation](https://huggingface.co/docs/spaces) for deployment.

---

## Example Output
```json
{
  "Company": "Tesla",
  "Articles": [
    {
      "Title": "Tesla's New Model Breaks Sales Records",
      "Summary": "Tesla's latest EV sees record sales in Q3...",
      "Sentiment": "Positive",
      "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]
    }
  ],
  "Comparative Sentiment Score": {
    "Sentiment Distribution": {"Positive": 1, "Negative": 1, "Neutral": 0},
    "Coverage Differences": [{
      "Comparison": "Article 1 highlights Tesla's strong sales, while Article 2 discusses regulatory issues.",
      "Impact": "Investors may react positively to growth news but stay cautious due to regulatory scrutiny."
    }],
    "Topic Overlap": {
      "Common Topics": ["Electric Vehicles"],
      "Unique Topics in Article 1": ["Stock Market", "Innovation"],
      "Unique Topics in Article 2": ["Regulations", "Autonomous Vehicles"]
    }
  },
  "Final Sentiment Analysis": "Tesla’s latest news coverage is mostly positive. Potential stock growth expected.",
  "Audio": "[Play Hindi Speech]"
}
```

---

## Contributing
Feel free to contribute by:
- Adding more news sources
- Improving the sentiment model
- Enhancing the UI

---

## Contact
For queries, reach out at [[email protected]].