Senzen committed on
Commit d82d43f · verified · 1 Parent(s): c576592

Update README.md

Files changed (1)
  1. README.md +168 -168
README.md CHANGED
@@ -1,168 +1,168 @@
- ---
- title: News Summarizer
- emoji: πŸ‘
- colorFrom: gray
- colorTo: green
- sdk: gradio
- sdk_version: 5.22.0
- app_file: app.py
- pinned: false
- short_description: An app for summarizing news articles on orgs.
- ---
-
- # News Summarization and Text-to-Speech Application
-
- ## Overview
- This project is a web-based application that extracts key details from multiple news articles related to a given company, performs sentiment analysis, conducts a comparative analysis, and generates a text-to-speech (TTS) output in Hindi.
-
- ## Features
- - **News Extraction**: Scrapes and displays at least 10 news articles from The New York Times and BBC.
- - **Sentiment Analysis**: Categorizes articles into Positive, Negative, or Neutral sentiments.
- - **Comparative Analysis**: Groups articles with most semantic similarity. Then compares the groups to derive insights on how a company's news coverage varies.
- - **Text-to-Speech (TTS)**: Converts the summarized sentiment report into Hindi speech.
- - **User Interface**: Provides a simple web-based interface using Gradio.
- - **API Integration**: Implements FastAPI for backend communication.
- - **Deployment**: Deployable on Hugging Face Spaces.
-
- ## Tech Stack
- - **Frontend**: Gradio
- - **Backend**: FastAPI
- - **Scraping**: BeautifulSoup
- - **NLP**: OpenAI GPT models, LangChain, Sentence Transformers
- - **Sentiment Analysis**: Pre-trained Transformer model
- - **Text-to-Speech**: Google TTS (gTTS)
- - **Deployment**: Uvicorn, Hugging Face Spaces
-
- ---
-
- ## Installation and Setup
-
- ### 1. Clone the Repository
- ```bash
- git clone https://github.com/Senzen18/News-Summarizer.git
- cd News-Summarizer
- ```
-
- ### 2. Install Dependencies
- Ensure you have Python 3.8+ installed. Then, run:
- ```bash
- pip install -r requirements.txt
- ```
-
- ### 3. To run Fast API endpoints
- Start the FastAPI backend:
- ```bash
- uvicorn api:app --host 127.0.0.1 --port 8000 --reload
- ```
-
- ### 4. To run the both Gradio and Fast API
- Start the FastAPI backend:
- ```bash
- gradio app.py
- ```
-
- ### 5. Access the Application
- Once started, access the Gradio UI at:
- ```
- http://127.0.0.1:7860
- ```
-
- ---
-
- ## API Endpoints
-
- ### 1. Fetch News
- **GET** `/news/{company_name}`
- - Fetches the latest articles related to a company.
- - **Example:** `/news/Tesla`
-
- ### 2. Analyze News Sentiment
- **GET** `/analyze-news`
- - Performs sentiment analysis on the extracted articles.
-
- ### 3. Compare News Articles
- **POST** `/compare-news`
- - Performs comparative analysis.
- - **Request Body:**
- ```json
- {
- "api_key": "your-openai-api-key",
- "model_name": "gpt-4o-mini",
- "company_name": "Tesla"
- }
- ```
-
- ### 4. Generate Hindi Summary
- **GET** `/hindi-summary`
- - Returns the summarized analysis in Hindi and stores the speech file.
-
- ---
-
- ## File Structure
- ```
- ├── api.py # FastAPI backend for news extraction, sentiment analysis, and comparison
- ├── app.py # Gradio frontend to interact with users
- ├── llm_utils.py # Handles OpenAI API calls for topic extraction and comparative analysis
- ├── utils.py # Utility functions for web scraping, sentiment analysis, and TTS
- ├── requirements.txt # Dependencies
- └── README.md # Project documentation
- ```
-
- ---
-
- ## Assumptions and Limitations
- - Only extracts articles from The New York Times and BBC.
- - Requires a valid OpenAI API key for sentiment analysis and comparison.
- - Hindi speech output uses gTTS, which requires an internet connection.
-
- ---
-
- ## Deployment
- This project can be deployed on Hugging Face Spaces. To deploy:
- 1. Push your repository to GitHub.
- 2. Follow [Hugging Face Spaces documentation](https://huggingface.co/docs/spaces) for deployment.
-
- ---
-
- ## Example Output
- ```json
- {
- "Company": "Tesla",
- "Articles": [
- {
- "Title": "Tesla's New Model Breaks Sales Records",
- "Summary": "Tesla's latest EV sees record sales in Q3...",
- "Sentiment": "Positive",
- "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]
- }
- ],
- "Comparative Sentiment Score": {
- "Sentiment Distribution": {"Positive": 1, "Negative": 1, "Neutral": 0},
- "Coverage Differences": [{
- "Comparison": "Article 1 highlights Tesla's strong sales, while Article 2 discusses regulatory issues.",
- "Impact": "Investors may react positively to growth news but stay cautious due to regulatory scrutiny."
- }],
- "Topic Overlap": {
- "Common Topics": ["Electric Vehicles"],
- "Unique Topics in Article 1": ["Stock Market", "Innovation"],
- "Unique Topics in Article 2": ["Regulations", "Autonomous Vehicles"]
- }
- },
- "Final Sentiment Analysis": "Tesla’s latest news coverage is mostly positive. Potential stock growth expected.",
- "Audio": "[Play Hindi Speech]"
- }
- ```
-
- ---
-
- ## Contributing
- Feel free to contribute by:
- - Adding more news sources
- - Improving the sentiment model
- - Enhancing the UI
-
- ---
-
- ## Contact
- For queries, reach out at [[email protected]].
-
 
+ ---
+ title: News Summarizer
+ emoji: πŸ‘
+ colorFrom: gray
+ colorTo: green
+ sdk: gradio
+ sdk_version: 5.22.0
+ app_file: api.py
+ pinned: false
+ short_description: An app for summarizing news articles on orgs.
+ ---
+
+ # News Summarization and Text-to-Speech Application
+
+ ## Overview
+ This project is a web-based application that extracts key details from multiple news articles related to a given company, performs sentiment analysis, conducts a comparative analysis, and generates a text-to-speech (TTS) output in Hindi.
+
+ ## Features
+ - **News Extraction**: Scrapes and displays at least 10 news articles from The New York Times and BBC.
+ - **Sentiment Analysis**: Categorizes articles into Positive, Negative, or Neutral sentiments.
+ - **Comparative Analysis**: Groups articles with most semantic similarity. Then compares the groups to derive insights on how a company's news coverage varies.
+ - **Text-to-Speech (TTS)**: Converts the summarized sentiment report into Hindi speech.
+ - **User Interface**: Provides a simple web-based interface using Gradio.
+ - **API Integration**: Implements FastAPI for backend communication.
+ - **Deployment**: Deployable on Hugging Face Spaces.
+
+ ## Tech Stack
+ - **Frontend**: Gradio
+ - **Backend**: FastAPI
+ - **Scraping**: BeautifulSoup
+ - **NLP**: OpenAI GPT models, LangChain, Sentence Transformers
+ - **Sentiment Analysis**: Pre-trained Transformer model
+ - **Text-to-Speech**: Google TTS (gTTS)
+ - **Deployment**: Uvicorn, Hugging Face Spaces
+
+ ---
+
+ ## Installation and Setup
+
+ ### 1. Clone the Repository
+ ```bash
+ git clone https://github.com/Senzen18/News-Summarizer.git
+ cd News-Summarizer
+ ```
+
+ ### 2. Install Dependencies
+ Ensure you have Python 3.8+ installed. Then, run:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ### 3. To run Fast API endpoints
+ Start the FastAPI backend:
+ ```bash
+ uvicorn api:app --host 127.0.0.1 --port 8000 --reload
+ ```
+
+ ### 4. To run the both Gradio and Fast API
+ Start the FastAPI backend:
+ ```bash
+ gradio app.py
+ ```
+
+ ### 5. Access the Application
+ Once started, access the Gradio UI at:
+ ```
+ http://127.0.0.1:7860
+ ```
+
+ ---
+
+ ## API Endpoints
+
+ ### 1. Fetch News
+ **GET** `/news/{company_name}`
+ - Fetches the latest articles related to a company.
+ - **Example:** `/news/Tesla`
+
+ ### 2. Analyze News Sentiment
+ **GET** `/analyze-news`
+ - Performs sentiment analysis on the extracted articles.
+
+ ### 3. Compare News Articles
+ **POST** `/compare-news`
+ - Performs comparative analysis.
+ - **Request Body:**
+ ```json
+ {
+ "api_key": "your-openai-api-key",
+ "model_name": "gpt-4o-mini",
+ "company_name": "Tesla"
+ }
+ ```
+
+ ### 4. Generate Hindi Summary
+ **GET** `/hindi-summary`
+ - Returns the summarized analysis in Hindi and stores the speech file.
+
+ ---
+
+ ## File Structure
+ ```
+ ├── api.py # FastAPI backend for news extraction, sentiment analysis, and comparison
+ ├── app.py # Gradio frontend to interact with users
+ ├── llm_utils.py # Handles OpenAI API calls for topic extraction and comparative analysis
+ ├── utils.py # Utility functions for web scraping, sentiment analysis, and TTS
+ ├── requirements.txt # Dependencies
+ └── README.md # Project documentation
+ ```
+
+ ---
+
+ ## Assumptions and Limitations
+ - Only extracts articles from The New York Times and BBC.
+ - Requires a valid OpenAI API key for sentiment analysis and comparison.
+ - Hindi speech output uses gTTS, which requires an internet connection.
+
+ ---
+
+ ## Deployment
+ This project can be deployed on Hugging Face Spaces. To deploy:
+ 1. Push your repository to GitHub.
+ 2. Follow [Hugging Face Spaces documentation](https://huggingface.co/docs/spaces) for deployment.
+
+ ---
+
+ ## Example Output
+ ```json
+ {
+ "Company": "Tesla",
+ "Articles": [
+ {
+ "Title": "Tesla's New Model Breaks Sales Records",
+ "Summary": "Tesla's latest EV sees record sales in Q3...",
+ "Sentiment": "Positive",
+ "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]
+ }
+ ],
+ "Comparative Sentiment Score": {
+ "Sentiment Distribution": {"Positive": 1, "Negative": 1, "Neutral": 0},
+ "Coverage Differences": [{
+ "Comparison": "Article 1 highlights Tesla's strong sales, while Article 2 discusses regulatory issues.",
+ "Impact": "Investors may react positively to growth news but stay cautious due to regulatory scrutiny."
+ }],
+ "Topic Overlap": {
+ "Common Topics": ["Electric Vehicles"],
+ "Unique Topics in Article 1": ["Stock Market", "Innovation"],
+ "Unique Topics in Article 2": ["Regulations", "Autonomous Vehicles"]
+ }
+ },
+ "Final Sentiment Analysis": "Tesla’s latest news coverage is mostly positive. Potential stock growth expected.",
+ "Audio": "[Play Hindi Speech]"
+ }
+ ```
+
+ ---
+
+ ## Contributing
+ Feel free to contribute by:
+ - Adding more news sources
+ - Improving the sentiment model
+ - Enhancing the UI
+
+ ---
+
+ ## Contact
+ For queries, reach out at [[email protected]].
+
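
The README in this commit documents a `GET /news/{company_name}` route and a `POST /compare-news` route that takes a JSON body. As a rough illustration of how a client might address them, here is a minimal sketch using only the Python standard library. It assumes the backend is reachable at `http://127.0.0.1:8000` (the host and port from the `uvicorn` command in the README). The helper names `news_url` and `build_compare_request` are hypothetical and do not come from the repository. The sketch only builds the requests; sending them requires the backend to be running.

```python
import json
from urllib import request
from urllib.parse import quote

# Assumption: host and port taken from the uvicorn command in the README.
BASE_URL = "http://127.0.0.1:8000"


def news_url(company_name, base_url=BASE_URL):
    """Return the URL for GET /news/{company_name}, percent-encoding the name."""
    return f"{base_url}/news/{quote(company_name, safe='')}"


def build_compare_request(api_key, company_name,
                          model_name="gpt-4o-mini", base_url=BASE_URL):
    """Build (but do not send) the POST /compare-news request.

    The payload keys mirror the request body shown in the README.
    """
    payload = {
        "api_key": api_key,
        "model_name": model_name,
        "company_name": company_name,
    }
    return request.Request(
        url=f"{base_url}/compare-news",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_compare_request("your-openai-api-key", "Tesla")
    print(req.get_method(), req.full_url)
    print(news_url("Tesla"))
    # To actually send the request (hypothetical usage, backend must be up):
    # with request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect before an OpenAI API key is sent anywhere.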