import gradio as gr
import pandas as pd
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer globally for efficiency
model_name = "tabularisai/multilingual-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)


def predict_sentiment(texts):
    """
    Predict sentiment for a list of texts
    """
    if not texts:
        return []  # nothing to classify, so skip the model call
    inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    sentiment_map = {
        0: "Very Negative",
        1: "Negative",
        2: "Neutral",
        3: "Positive",
        4: "Very Positive"
    }
    return [sentiment_map[p] for p in torch.argmax(probabilities, dim=-1).tolist()]
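

# Optional sketch, not part of the original app: predict_sentiment above sends
# every review through the model in a single batch, which may exhaust memory on
# large files. The hypothetical helper below (its name and the batch_size
# default are assumptions) splits the input into chunks and concatenates the
# labels. Pass predict_sentiment as predict_fn to use it.
def predict_in_batches(texts, predict_fn, batch_size=32):
    """Apply predict_fn to texts in chunks of batch_size and join the results."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results = []
    for start in range(0, len(texts), batch_size):
        # Each chunk holds at most batch_size items; the last one may be shorter
        results.extend(predict_fn(texts[start:start + batch_size]))
    return results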


def process_file(file_obj):
    """
    Process the input file and add sentiment analysis results
    """
    try:
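        # Read the file based on its extension. Depending on the Gradio version,
        # gr.File passes either a path string or a tempfile-like object with .name.
        if file_obj is None:
            raise ValueError("Please upload a CSV or Excel file first.")
        file_path = file_obj if isinstance(file_obj, str) else file_obj.name
        suffix = file_path.lower()  # match extensions case-insensitively
        if suffix.endswith('.csv'):
            df = pd.read_csv(file_path)
        elif suffix.endswith(('.xlsx', '.xls')):
            df = pd.read_excel(file_path)
        else:
            raise ValueError("Unsupported file format. Please upload a CSV or Excel file.")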

        # Verify that 'Reviews' column exists
        if 'Reviews' not in df.columns:
            raise ValueError("Input file must contain a 'Reviews' column.")

        # Perform sentiment analysis
        # Replace missing values and cast to str; the tokenizer only accepts strings
        reviews = df['Reviews'].fillna("").astype(str)
        sentiments = predict_sentiment(reviews.tolist())

        # Add results to the dataframe
        df['Sentiment'] = sentiments

        # Save the results to a new Excel file
        output_path = "output_with_sentiment.xlsx"
        df.to_excel(output_path, index=False)

        return df, output_path

    except Exception as e:
        raise gr.Error(str(e))


# Create Gradio interface
with gr.Blocks() as interface:
    gr.Markdown("# Review Sentiment Analysis")
    gr.Markdown("Upload an Excel or CSV file with a 'Reviews' column to analyze sentiment.")

    with gr.Row():
        file_input = gr.File(
            label="Upload File (CSV or Excel)",
            file_types=[".csv", ".xlsx", ".xls"]
        )

    with gr.Row():
        analyze_btn = gr.Button("Analyze Sentiments")

    with gr.Row():
        output_df = gr.Dataframe(label="Results Preview")
        output_file = gr.File(label="Download Results")

    analyze_btn.click(
        fn=process_file,
        inputs=[file_input],
        outputs=[output_df, output_file]
    )

# Launch the interface
interface.launch()