---
title: News Summarizer
emoji: π
colorFrom: gray
colorTo: green
sdk: docker
sdk_version: 5.22.0
app_file: api.py
pinned: false
short_description: An app for summarizing news articles on orgs.
---
# News Summarization and Text-to-Speech Application
## Overview
This project is a web-based application that extracts key details from multiple news articles related to a given company, performs sentiment analysis, conducts a comparative analysis, and generates a text-to-speech (TTS) output in Hindi.
## Features
- **News Extraction**: Scrapes and displays at least 10 news articles from The New York Times and BBC.
- **Sentiment Analysis**: Categorizes articles into Positive, Negative, or Neutral sentiments.
- **Comparative Analysis**: Groups the most semantically similar articles, then compares the groups to derive insights into how a company's news coverage varies.
- **Text-to-Speech (TTS)**: Converts the summarized sentiment report into Hindi speech.
- **User Interface**: Provides a simple web-based interface using Gradio.
- **API Integration**: Implements FastAPI for backend communication.
- **Deployment**: Deployable on Hugging Face Spaces.
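
The grouping step of the comparative analysis can be illustrated with a minimal sketch. It assumes each article already has an embedding vector (the Tech Stack lists Sentence Transformers for this), and it uses hand-written toy vectors instead of a real model. The function names, the greedy strategy, and the `0.8` threshold are hypothetical choices for illustration, not the project's actual implementation.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_similarity(
    embeddings: Dict[str, Sequence[float]], threshold: float = 0.8
) -> List[List[str]]:
    """Greedy grouping (hypothetical strategy): an article joins the first
    group whose first member is at least `threshold` similar to it;
    otherwise it starts a new group."""
    groups: List[List[str]] = []
    for title, vector in embeddings.items():
        for group in groups:
            if cosine_similarity(embeddings[group[0]], vector) >= threshold:
                group.append(title)
                break
        else:
            # No existing group was similar enough.
            groups.append([title])
    return groups


# Toy 3-dimensional vectors standing in for real sentence embeddings.
toy_embeddings = {
    "Tesla sales surge": [0.9, 0.1, 0.0],
    "Tesla deliveries hit record": [0.8, 0.2, 0.1],
    "Regulators probe Tesla autopilot": [0.1, 0.9, 0.2],
}
groups = group_by_similarity(toy_embeddings)
print(groups)
```

With these toy vectors the two sales-related headlines land in one group and the regulatory headline in another. In practice the vectors would come from an embedding model, and the threshold would need tuning.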
## Tech Stack
- **Frontend**: Gradio
- **Backend**: FastAPI
- **Scraping**: BeautifulSoup
- **NLP**: OpenAI GPT models, LangChain, Sentence Transformers
- **Sentiment Analysis**: Pre-trained Transformer model
- **Text-to-Speech**: Google TTS (gTTS)
- **Deployment**: Uvicorn, Hugging Face Spaces
---
## Installation and Setup
### 1. Clone the Repository
```bash
git clone https://github.com/Senzen18/News-Summarizer.git
cd News-Summarizer
```
### 2. Install Dependencies
Ensure you have Python 3.8+ installed. Then, run:
```bash
pip install -r requirements.txt
```
### 3. Run the FastAPI Endpoints
Start the FastAPI backend:
```bash
uvicorn api:app --host 127.0.0.1 --port 8000 --reload
```
### 4. Run Both Gradio and FastAPI
Launch the Gradio app:
```bash
gradio app.py
```
### 5. Access the Application
Once started, access the Gradio UI at:
```
http://127.0.0.1:7860
```
---
## API Endpoints
### 1. Fetch News
**GET** `/news/{company_name}`
- Fetches the latest articles related to a company.
- **Example:** `/news/Tesla`
### 2. Analyze News Sentiment
**GET** `/analyze-news`
- Performs sentiment analysis on the extracted articles.
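
The two GET endpoints above can be called from Python as sketched below. This is a minimal client using only the standard library. It assumes the backend is running at `http://127.0.0.1:8000` (the host and port from the `uvicorn` command), that responses are JSON, and that `/analyze-news` operates on the articles fetched by the preceding `/news/{company_name}` call. The helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host/port from the uvicorn command


def news_url(company_name: str, base_url: str = BASE_URL) -> str:
    """Build the /news/{company_name} URL, percent-encoding the name."""
    return f"{base_url}/news/{quote(company_name, safe='')}"


def get_json(url: str, timeout: float = 60.0):
    """Send a GET request and decode the body as JSON (assumed format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_and_analyze(company_name: str, base_url: str = BASE_URL):
    """Fetch articles, then request sentiment analysis on them.

    Assumption: the server remembers the fetched articles between calls.
    """
    articles = get_json(news_url(company_name, base_url))
    sentiments = get_json(f"{base_url}/analyze-news")
    return articles, sentiments
```

With the backend running, `fetch_and_analyze("Tesla")` would return both responses. Company names containing spaces are encoded, so `news_url("Berkshire Hathaway")` ends in `/news/Berkshire%20Hathaway`.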
### 3. Compare News Articles
**POST** `/compare-news`
- Performs comparative analysis.
- **Request Body:**
```json
{
  "api_key": "your-openai-api-key",
  "model_name": "gpt-4o-mini",
  "company_name": "Tesla"
}
```
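
A request with this body could be built in Python as follows. This is a sketch using only the standard library. It assumes the backend is reachable at `http://127.0.0.1:8000` and returns JSON. The function names are hypothetical, and reading the key from an `OPENAI_API_KEY` environment variable is a suggested convention, not something the project requires.

```python
import json
import os
from urllib.request import Request, urlopen


def build_compare_request(
    api_key: str,
    model_name: str,
    company_name: str,
    base_url: str = "http://127.0.0.1:8000",  # assumed local backend
) -> Request:
    """Build (but do not send) the POST /compare-news request."""
    payload = {
        "api_key": api_key,
        "model_name": model_name,
        "company_name": company_name,
    }
    return Request(
        url=f"{base_url}/compare-news",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compare_news(company_name: str, model_name: str = "gpt-4o-mini"):
    """Send the request; the response is assumed to be JSON."""
    api_key = os.environ["OPENAI_API_KEY"]  # avoid hard-coding secrets
    request = build_compare_request(api_key, model_name, company_name)
    with urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.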
### 4. Generate Hindi Summary
**GET** `/hindi-summary`
- Returns the summarized analysis in Hindi and stores the speech file.
---
## File Structure
```
├── api.py             # FastAPI backend for news extraction, sentiment analysis, and comparison
├── app.py             # Gradio frontend to interact with users
├── llm_utils.py       # Handles OpenAI API calls for topic extraction and comparative analysis
├── utils.py           # Utility functions for web scraping, sentiment analysis, and TTS
├── requirements.txt   # Dependencies
└── README.md          # Project documentation
```
---
## Assumptions and Limitations
- Only extracts articles from The New York Times and BBC.
- Requires a valid OpenAI API key for topic extraction and comparative analysis.
- Hindi speech output uses gTTS, which requires an internet connection.
---
## Deployment
This project can be deployed on Hugging Face Spaces. To deploy:
1. Push your repository to GitHub.
2. Follow [Hugging Face Spaces documentation](https://huggingface.co/docs/spaces) for deployment.
---
## Example Output
```json
{
  "Company": "Tesla",
  "Articles": [
    {
      "Title": "Tesla's New Model Breaks Sales Records",
      "Summary": "Tesla's latest EV sees record sales in Q3...",
      "Sentiment": "Positive",
      "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]
    }
  ],
  "Comparative Sentiment Score": {
    "Sentiment Distribution": {"Positive": 1, "Negative": 1, "Neutral": 0},
    "Coverage Differences": [{
      "Comparison": "Article 1 highlights Tesla's strong sales, while Article 2 discusses regulatory issues.",
      "Impact": "Investors may react positively to growth news but stay cautious due to regulatory scrutiny."
    }],
    "Topic Overlap": {
      "Common Topics": ["Electric Vehicles"],
      "Unique Topics in Article 1": ["Stock Market", "Innovation"],
      "Unique Topics in Article 2": ["Regulations", "Autonomous Vehicles"]
    }
  },
  "Final Sentiment Analysis": "Tesla's latest news coverage is mostly positive. Potential stock growth expected.",
  "Audio": "[Play Hindi Speech]"
}
```
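
The `Sentiment Distribution` and `Topic Overlap` fields in the example above can be derived from per-article data with plain Python. The sketch below is illustrative only. The function names are hypothetical, and the second article is an assumed stand-in reconstructed from the `Topic Overlap` values, since the example lists just one article.

```python
from collections import Counter
from typing import Dict, List


def sentiment_distribution(articles: List[dict]) -> Dict[str, int]:
    """Count articles per sentiment label, always reporting all three."""
    counts = Counter(article["Sentiment"] for article in articles)
    return {label: counts.get(label, 0)
            for label in ("Positive", "Negative", "Neutral")}


def topic_overlap(topics_1: List[str], topics_2: List[str]) -> Dict[str, List[str]]:
    """Split two topic lists into shared and article-specific topics,
    preserving the original order of each list."""
    return {
        "Common Topics": [t for t in topics_1 if t in topics_2],
        "Unique Topics in Article 1": [t for t in topics_1 if t not in topics_2],
        "Unique Topics in Article 2": [t for t in topics_2 if t not in topics_1],
    }


articles = [
    {"Sentiment": "Positive",
     "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]},
    # Hypothetical second article, inferred from the overlap fields.
    {"Sentiment": "Negative",
     "Topics": ["Electric Vehicles", "Regulations", "Autonomous Vehicles"]},
]

distribution = sentiment_distribution(articles)
overlap = topic_overlap(articles[0]["Topics"], articles[1]["Topics"])
print(distribution)
print(overlap)
```

Using list comprehensions instead of set operations keeps the topic order stable, which makes the output reproducible across runs.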
---
## Contributing
Feel free to contribute by:
- Adding more news sources
- Improving the sentiment model
- Enhancing the UI
---
## Contact
For queries, reach out at [email protected].