metadata
title: News Summarizer
emoji: πŸ‘
colorFrom: gray
colorTo: green
sdk: docker
sdk_version: 5.22.0
app_file: api.py
pinned: false
short_description: An app for summarizing news articles on orgs.

News Summarization and Text-to-Speech Application

Overview

This project is a web-based application that extracts key details from multiple news articles related to a given company, performs sentiment analysis, conducts a comparative analysis, and generates a text-to-speech (TTS) output in Hindi.

Features

  • News Extraction: Scrapes and displays at least 10 news articles from The New York Times and BBC.
  • Sentiment Analysis: Categorizes articles into Positive, Negative, or Neutral sentiments.
  • Comparative Analysis: Groups the most semantically similar articles, then compares the groups to derive insights into how a company's news coverage varies.
  • Text-to-Speech (TTS): Converts the summarized sentiment report into Hindi speech.
  • User Interface: Provides a simple web-based interface using Gradio.
  • API Integration: Implements FastAPI for backend communication.
  • Deployment: Deployable on Hugging Face Spaces.
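
The grouping step of the comparative analysis can be illustrated with a minimal sketch. It assumes each article has already been turned into an embedding vector (the project lists Sentence Transformers for this; the toy 2-D vectors, the greedy strategy, and the 0.8 threshold below are illustrative assumptions, not the project's actual implementation).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def group_by_similarity(embeddings, threshold=0.8):
    """Greedily put each article into the first group whose seed article
    is at least `threshold` similar; otherwise start a new group."""
    groups = []  # each group is a list of article indices; group[0] is its seed
    for idx, vec in enumerate(embeddings):
        for group in groups:
            if cosine_similarity(vec, embeddings[group[0]]) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

# Hypothetical toy vectors standing in for real sentence embeddings.
toy_embeddings = [
    [1.0, 0.0],  # article 0
    [0.9, 0.1],  # article 1, close to article 0
    [0.0, 1.0],  # article 2, unrelated to article 0
    [0.1, 0.9],  # article 3, close to article 2
]
groups = group_by_similarity(toy_embeddings)
print(groups)  # [[0, 1], [2, 3]]
```

Each resulting group could then be summarized and compared against the others to surface differences in coverage.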

Tech Stack

  • Frontend: Gradio
  • Backend: FastAPI
  • Scraping: BeautifulSoup
  • NLP: OpenAI GPT models, LangChain, Sentence Transformers
  • Sentiment Analysis: Pre-trained Transformer model
  • Text-to-Speech: Google TTS (gTTS)
  • Deployment: Uvicorn, Hugging Face Spaces

Installation and Setup

1. Clone the Repository

git clone https://github.com/Senzen18/News-Summarizer.git
cd News-Summarizer

2. Install Dependencies

Ensure you have Python 3.8+ installed. Then, run:

pip install -r requirements.txt

3. Run the FastAPI Endpoints

Start the FastAPI backend:

uvicorn api:app --host 127.0.0.1 --port 8000 --reload

4. Run Both Gradio and FastAPI

Start the Gradio interface together with the FastAPI backend:

gradio app.py

5. Access the Application

Once started, access the Gradio UI at:

http://127.0.0.1:7860

API Endpoints

1. Fetch News

GET /news/{company_name}

  • Fetches the latest articles related to a company.
  • Example: /news/Tesla
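
As a rough sketch, the endpoint could be called from Python using only the standard library. The base URL assumes the host and port from the uvicorn command above; `build_news_url` and `fetch_news` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # assumption: local uvicorn host and port

def build_news_url(company_name, base_url=BASE_URL):
    """Return the /news/{company_name} URL with the name percent-encoded."""
    return f"{base_url}/news/{quote(company_name, safe='')}"

def fetch_news(company_name):
    """Call the endpoint and decode the body (assumed to be JSON)."""
    with urlopen(build_news_url(company_name), timeout=30) as resp:
        return json.load(resp)

print(build_news_url("Tesla"))              # http://127.0.0.1:8000/news/Tesla
print(build_news_url("Johnson & Johnson"))  # http://127.0.0.1:8000/news/Johnson%20%26%20Johnson
```

Percent-encoding the name keeps spaces and characters such as & from breaking the path segment. `fetch_news("Tesla")` only works while the backend is running.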

2. Analyze News Sentiment

GET /analyze-news

  • Performs sentiment analysis on the extracted articles.

3. Compare News Articles

POST /compare-news

  • Performs comparative analysis.
  • Request Body:
{
  "api_key": "your-openai-api-key",
  "model_name": "gpt-4o-mini",
  "company_name": "Tesla"
}
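
A minimal sketch of building this request with the Python standard library follows. The field names come from the request body above; the base URL and the helper name `build_compare_request` are assumptions made for illustration.

```python
import json
from urllib.request import Request

def build_compare_request(api_key, company_name, model_name="gpt-4o-mini",
                          base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST request for /compare-news."""
    payload = {
        "api_key": api_key,
        "model_name": model_name,
        "company_name": company_name,
    }
    return Request(
        f"{base_url}/compare-news",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_compare_request("your-openai-api-key", "Tesla")
print(req.get_method(), req.full_url)
```

To send it, pass `req` to `urllib.request.urlopen` while the backend is running. Reading the API key from an environment variable rather than hard-coding it is usually safer.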

4. Generate Hindi Summary

GET /hindi-summary

  • Returns the summarized analysis in Hindi and stores the speech file.

File Structure

├── api.py               # FastAPI backend for news extraction, sentiment analysis, and comparison
├── app.py               # Gradio frontend to interact with users
├── llm_utils.py         # Handles OpenAI API calls for topic extraction and comparative analysis
├── utils.py             # Utility functions for web scraping, sentiment analysis, and TTS
├── requirements.txt     # Dependencies
└── README.md            # Project documentation

Assumptions and Limitations

  • Only extracts articles from The New York Times and BBC.
  • Requires a valid OpenAI API key for sentiment analysis and comparison.
  • Hindi speech output uses gTTS, which requires an internet connection.

Deployment

This project can be deployed on Hugging Face Spaces. To deploy:

  1. Push your repository to GitHub.
  2. Follow Hugging Face Spaces documentation for deployment.

Example Output

{
  "Company": "Tesla",
  "Articles": [
    {
      "Title": "Tesla's New Model Breaks Sales Records",
      "Summary": "Tesla's latest EV sees record sales in Q3...",
      "Sentiment": "Positive",
      "Topics": ["Electric Vehicles", "Stock Market", "Innovation"]
    }
  ],
  "Comparative Sentiment Score": {
    "Sentiment Distribution": {"Positive": 1, "Negative": 1, "Neutral": 0},
    "Coverage Differences": [{
      "Comparison": "Article 1 highlights Tesla's strong sales, while Article 2 discusses regulatory issues.",
      "Impact": "Investors may react positively to growth news but stay cautious due to regulatory scrutiny."
    }],
    "Topic Overlap": {
      "Common Topics": ["Electric Vehicles"],
      "Unique Topics in Article 1": ["Stock Market", "Innovation"],
      "Unique Topics in Article 2": ["Regulations", "Autonomous Vehicles"]
    }
  },
  "Final Sentiment Analysis": "Tesla's latest news coverage is mostly positive. Potential stock growth expected.",
  "Audio": "[Play Hindi Speech]"
}
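
For illustration, the "Sentiment Distribution" field can be recomputed from the per-article "Sentiment" labels. This is a sketch that assumes the output keeps the shape shown above; `sentiment_distribution` is a hypothetical helper, and the second article is invented sample data so that the counts match the example.

```python
from collections import Counter

def sentiment_distribution(report):
    """Count articles per sentiment label, always reporting all three labels."""
    counts = Counter(article["Sentiment"] for article in report["Articles"])
    return {label: counts.get(label, 0) for label in ("Positive", "Negative", "Neutral")}

report = {
    "Company": "Tesla",
    "Articles": [
        {"Title": "Tesla's New Model Breaks Sales Records", "Sentiment": "Positive"},
        # Hypothetical second article, added only for this sketch.
        {"Title": "Hypothetical article on regulatory issues", "Sentiment": "Negative"},
    ],
}
print(sentiment_distribution(report))  # {'Positive': 1, 'Negative': 1, 'Neutral': 0}
```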

Contributing

Feel free to contribute by:

  • Adding more news sources
  • Improving the sentiment model
  • Enhancing the UI

Contact

For queries, reach out at [[email protected]].