Spaces: rafaldembski/ScamDetector


rafaldembski committed on Sep 29, 2024

Commit 2c03919 (verified) · 1 Parent(s): 0e6f84f

Update utils/functions.py

Files changed (1)

utils/functions.py +55 -54

@@ -19,10 +19,12 @@ def init_fake_numbers_file():
         with open(FAKE_NUMBERS_FILE, 'w') as f:
             json.dump([], f)
     else:
         try:
             with open(FAKE_NUMBERS_FILE, 'r') as f:
                 json.load(f)
         except json.JSONDecodeError:
             with open(FAKE_NUMBERS_FILE, 'w') as f:
                 json.dump([], f)
@@ -47,7 +49,7 @@ def add_fake_number(phone_number):
             logging.error(f"Nie udało się zapisać numeru {phone_number}: {e}")
             return False
     else:
-        return False
 # Check whether the phone number is in the JSON file
 def is_fake_number(phone_number):
@@ -74,34 +76,30 @@ def get_phone_info(phone_number):
 # Simple heuristic checks on the message
 def simple_checks(message):
     warnings = []
-    scam_keywords = {
-        'Polish': ['pieniądze', 'przelew', 'hasło', 'kod', 'nagroda', 'wygrana', 'pilne', 'pomoc', 'opłata',
-                   'płatność', 'transakcja', 'karta kredytowa', 'konto bankowe', 'login', 'weryfikacja', 'awaria',
-                   'fałszywe', 'zapłać', 'prześlij', 'złóż wniosek', 'aktywuj', 'dostęp', 'zweryfikuj', 'proszę podać',
-                   'debet', 'dłużnik', 'pożyczka', 'zwrot pieniędzy', 'potwierdzenie płatności'],
-        'German': ['geld', 'überweisung', 'passwort', 'code', 'gewinn', 'gewonnen', 'dringend', 'hilfe', 'zahlung',
-                   'gebühr', 'zahlungsempfänger', 'kreditkarte', 'bankkonto', 'login', 'verifizierung', 'ausfall',
-                   'fälschung', 'zahlen', 'übertragen', 'antrag einreichen', 'aktivieren', 'zugang', 'überprüfen',
-                   'bitte angeben', 'schuldner', 'kredit', 'rückerstattung', 'zahlung bestätigen'],
-        'English': ['money', 'transfer', 'password', 'code', 'prize', 'won', 'urgent', 'help', 'payment', 'fee',
-                    'transaction', 'credit card', 'bank account', 'login', 'verification', 'failure', 'fake', 'pay',
-                    'send', 'apply', 'activate', 'access', 'verify', 'please provide', 'debt', 'loan', 'refund',
-                    'payment confirmation']
-    }
-    for lang, keywords in scam_keywords.items():
-        if any(keyword in message.lower() for keyword in keywords):
-            warnings.append(f"Wiadomość zawiera słowa kluczowe związane z potencjalnym oszustwem ({lang}).")
     if re.search(r'http[s]?://', message):
         warnings.append("Wiadomość zawiera link.")
     if re.search(r'\b(podaj|prześlij|udostępnij)\b.*\b(hasło|kod|dane osobowe|numer konta)\b', message.lower()):
         warnings.append("Wiadomość zawiera prośbę o poufne informacje.")
     return warnings
-# Function that analyzes the message via the SambaNova API with an advanced reasoning process
 def analyze_message(message, phone_number, additional_info, api_key, language):
     if not api_key:
         logging.error("Brak klucza API.")
@@ -111,82 +109,84 @@ def analyze_message(message, phone_number, additional_info, api_key, language):
     headers = {
         "Authorization": f"Bearer {api_key}"
     }
-    # Updated prompts with even more detailed analysis
     system_prompts = {
         'Polish': """
-Jesteś zaawansowanym asystentem AI specjalizującym się w identyfikacji fałszywych wiadomości SMS. Twoim zadaniem jest przeprowadzenie szczegółowej analizy wiadomości, wykorzystując głęboki proces myślenia i dostarczając kompleksową ocenę. Twoja odpowiedź powinna być podzielona na trzy sekcje:
 <analysis>
 **Analiza Treści Wiadomości:**
-- Przeprowadź szczegółową analizę treści wiadomości, identyfikując potencjalne czerwone flagi, takie jak błędy językowe, prośby o dane osobowe, pilne prośby o kontakt itp.
-- Opisz kontekst językowy i kulturowy wiadomości.
-- Zidentyfikuj wszelkie elementy, które mogą sugerować, że wiadomość jest próbą wyłudzenia informacji lub pieniędzy.
 </analysis>
 <risk_assessment>
 **Ocena Ryzyka Oszustwa:**
-- Na podstawie analizy treści i dostępnych informacji oceń prawdopodobieństwo, że wiadomość jest oszustwem. Użyj skali od 1 do 10, gdzie 1 oznacza bardzo niskie ryzyko, a 10 bardzo wysokie ryzyko.
-- Wyjaśnij, jakie czynniki wpływają na tę ocenę.
 </risk_assessment>
 <recommendations>
 **Zalecenia dla Użytkownika:**
-- Podaj jasne i konkretne zalecenia dotyczące dalszych kroków, które użytkownik powinien podjąć.
-- Uwzględnij sugestie dotyczące bezpieczeństwa, takie jak blokowanie nadawcy, zgłaszanie wiadomości do odpowiednich instytucji, czy też ignorowanie wiadomości.
-- Jeśli to możliwe, zasugeruj dodatkowe środki ostrożności, które użytkownik może podjąć, aby chronić swoje dane osobowe i finansowe.
 </recommendations>
         """,
         'German': """
-Du bist ein fortgeschrittener KI-Assistent, spezialisiert auf die Identifizierung gefälschter SMS-Nachrichten. Deine Aufgabe ist es, eine detaillierte Analyse der Nachricht durchzuführen, indem du einen tiefgreifenden Denkprozess nutzt und eine umfassende Bewertung lieferst. Deine Antwort sollte in drei Abschnitte unterteilt sein:
 <analysis>
 **Nachrichteninhaltsanalyse:**
-- Führe eine detaillierte Analyse des Nachrichteninhalts durch und identifiziere potenzielle rote Flaggen wie sprachliche Fehler, Aufforderungen zur Preisgabe persönlicher Daten, dringende Kontaktanfragen usw.
-- Beschreibe den sprachlichen und kulturellen Kontext der Nachricht.
-- Identifiziere alle Elemente, die darauf hindeuten könnten, dass die Nachricht ein Versuch ist, Informationen oder Geld zu erlangen.
 </analysis>
 <risk_assessment>
 **Betrugsrisikobewertung:**
-- Basierend auf der Inhaltsanalyse und den verfügbaren Informationen, bewerte die Wahrscheinlichkeit, dass die Nachricht ein Betrug ist. Verwende eine Skala von 1 bis 10, wobei 1 sehr geringes Risiko und 10 sehr hohes Risiko bedeutet.
-- Erkläre, welche Faktoren diese Bewertung beeinflussen.
 </risk_assessment>
 <recommendations>
 **Empfehlungen für den Benutzer:**
-- Gib klare und konkrete Empfehlungen zu den nächsten Schritten, die der Benutzer unternehmen sollte.
-- Berücksichtige Sicherheitsempfehlungen wie das Blockieren des Absenders, das Melden der Nachricht an entsprechende Behörden oder das Ignorieren der Nachricht.
-- Wenn möglich, schlage zusätzliche Vorsichtsmaßnahmen vor, die der Benutzer ergreifen kann, um seine persönlichen und finanziellen Daten zu schützen.
 </recommendations>
         """,
         'English': """
-You are an advanced AI assistant specializing in identifying fake SMS messages. Your task is to conduct a detailed analysis of the message, utilizing a deep thinking process and providing a comprehensive assessment. Your response should be divided into three sections:
 <analysis>
 **Message Content Analysis:**
-- Conduct a detailed analysis of the message content, identifying potential red flags such as language errors, requests for personal information, urgent contact requests, etc.
-- Describe the linguistic and cultural context of the message.
-- Identify any elements that may suggest the message is an attempt to solicit information or money.
 </analysis>
 <risk_assessment>
 **Fraud Risk Assessment:**
-- Based on the content analysis and available information, assess the likelihood that the message is fraudulent. Use a scale from 1 to 10, where 1 indicates very low risk and 10 indicates very high risk.
-- Explain the factors that influence this assessment.
 </risk_assessment>
 <recommendations>
 **User Recommendations:**
-- Provide clear and concrete recommendations regarding the next steps the user should take.
-- Include security suggestions such as blocking the sender, reporting the message to appropriate authorities, or ignoring the message.
-- If possible, suggest additional precautionary measures the user can take to protect their personal and financial information.
 </recommendations>
         """
     }
     system_prompt = system_prompts.get(language, system_prompts['English'])  # Default to English if language not found
     user_prompt = f"""Analyze the following message for potential fraud:
 Message: "{message}"
@@ -214,6 +214,7 @@ Provide your analysis and conclusions following the guidelines above."""
         if response.status_code == 200:
             data = response.json()
             ai_response = data['choices'][0]['message']['content']
             analysis = re.search(r'<analysis>(.*?)</analysis>', ai_response, re.DOTALL)
             risk_assessment = re.search(r'<risk_assessment>(.*?)</risk_assessment>', ai_response, re.DOTALL)
             recommendations = re.search(r'<recommendations>(.*?)</recommendations>', ai_response, re.DOTALL)
@@ -295,7 +296,7 @@ def add_to_history(message, phone_number, analysis, risk, recommendations):
 def get_history():
     history_file = 'history.json'
     try:
-        with open(history_file, 'r') as f:
             history = json.load(f)
         return history
     except (json.JSONDecodeError, FileNotFoundError):

         with open(FAKE_NUMBERS_FILE, 'w') as f:
             json.dump([], f)
     else:
+        # Check that the file is not empty and contains valid JSON
         try:
             with open(FAKE_NUMBERS_FILE, 'r') as f:
                 json.load(f)
         except json.JSONDecodeError:
+            # If the file is corrupted or empty, reset it to an empty list
             with open(FAKE_NUMBERS_FILE, 'w') as f:
                 json.dump([], f)
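
The reset logic in this hunk can be tried in isolation. What follows is a minimal sketch, not the repository's code: `ensure_json_list` is a hypothetical helper that takes the path as a parameter instead of reading `FAKE_NUMBERS_FILE`, and the `os.path.exists` test is an assumption, since the diff shows only the `else` branch.

```python
import json
import os


def ensure_json_list(path):
    """Create `path` holding an empty JSON list, or reset it if it fails to parse.

    Hypothetical helper mirroring the hunk above. Returns True when the
    file had to be written, False when it already held valid JSON.
    """
    if not os.path.exists(path):
        with open(path, 'w') as f:
            json.dump([], f)
        return True
    try:
        with open(path, 'r') as f:
            json.load(f)
    except json.JSONDecodeError:
        # An empty or corrupted file does not parse, so start over.
        with open(path, 'w') as f:
            json.dump([], f)
        return True
    return False
```

Returning a flag is a design choice made for this sketch only; it lets a caller log that stored numbers were discarded.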
             logging.error(f"Nie udało się zapisać numeru {phone_number}: {e}")
             return False
     else:
+        return False  # Number already exists
 # Check whether the phone number is in the JSON file
 def is_fake_number(phone_number):
 # Simple heuristic checks on the message
 def simple_checks(message):
     warnings = []
+    # Expanded keyword base in three languages (Polish, German, English)
+    scam_keywords = [
+        'pieniądze', 'przelew', 'hasło', 'kod', 'nagroda', 'wygrana', 'pilne', 'pomoc', 'opłata', 'konto', 'bank',
+        'spłata', 'dług', 'problem', 'uwaga', 'natychmiast', 'transakcja', 'podaj dane', 'cena', 'bonus', 'transakcja',
+        'alert', 'bezpieczeństwo', 'oszustwo', 'blokada', 'abonament', 'naliczenie', 'dług', 'monit', 'prośba',
+        'Überweisung', 'Passwort', 'Geld', 'dringend', 'Gewinn', 'Betrag', 'Schuld', 'Konto', 'Hilfe', 'Problem',
+        'Sofort', 'Transaktion', 'Daten angeben', 'Belohnung', 'Gewinnspiel', 'Bonus', 'Erinnerung', 'Warnung',
+        'sicher', 'Verdacht', 'blockiert', 'Zahlung', 'Zahlungsforderung', 'betrug', 'Bankkonto',
+        'Award', 'prize', 'account', 'urgent', 'money', 'transfer', 'help', 'password', 'code', 'payment',
+        'emergency', 'win', 'problem', 'immediate', 'transaction', 'bank account', 'debt', 'discount', 'offer',
+        'security', 'alert', 'fraud', 'blocked', 'subscription', 'charge', 'request', 'notice'
+    ]
+    if any(keyword.lower() in message.lower() for keyword in scam_keywords):
+        warnings.append("Wiadomość zawiera słowa kluczowe związane z potencjalnym oszustwem.")
+    # Check for links
     if re.search(r'http[s]?://', message):
         warnings.append("Wiadomość zawiera link.")
+    # Check whether the sender asks for confidential information
     if re.search(r'\b(podaj|prześlij|udostępnij)\b.*\b(hasło|kod|dane osobowe|numer konta)\b', message.lower()):
         warnings.append("Wiadomość zawiera prośbę o poufne informacje.")
     return warnings
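
A substring test against a lowercased message only matches keywords that are themselves lowercase, and it also fires inside longer words. The following is a minimal sketch of a case-folded, whole-word variant. `find_scam_keywords` is a hypothetical name, and returning the matched keywords rather than a single warning is an assumption made for illustration, not something the commit does.

```python
import re


def find_scam_keywords(message, keywords):
    """Return the keywords that occur in `message` as whole words or phrases.

    Hypothetical helper: both sides are case-folded, and lookarounds on
    word characters keep 'kod' from matching inside 'kodeks'.
    """
    text = message.casefold()
    hits = []
    for keyword in keywords:
        needle = keyword.casefold()
        # (?<!\w) and (?!\w) require a non-word character or a string edge.
        pattern = r'(?<!\w)' + re.escape(needle) + r'(?!\w)'
        if re.search(pattern, text) and needle not in hits:
            hits.append(needle)
    return hits
```

With this approach, capitalized entries such as `'Überweisung'` can match, and duplicate entries in the list are reported once.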
+# Function that analyzes the message via the SambaNova API
 def analyze_message(message, phone_number, additional_info, api_key, language):
     if not api_key:
         logging.error("Brak klucza API.")
     headers = {
         "Authorization": f"Bearer {api_key}"
     }
+    # Expanded system prompts
     system_prompts = {
         'Polish': """
+Jesteś zaawansowanym asystentem AI specjalizującym się w identyfikacji fałszywych wiadomości SMS. Twoim zadaniem jest przeprowadzenie szczegółowej, wieloetapowej analizy wiadomości. Twój system myślenia musi przechodzić przez każdą potencjalną czerwoną flagę, szczegółowo oceniać ryzyka i dostarczać złożoną, lecz czytelną analizę. Twoja odpowiedź powinna być podzielona na trzy sekcje:
 <analysis>
 **Analiza Treści Wiadomości:**
+- Zidentyfikuj wszelkie potencjalne czerwone flagi, takie jak: błędy językowe, prośby o dane osobowe, pilne prośby o kontakt, sugestie dotyczące transferów finansowych itp.
+- Oceń językowo-kulturowy kontekst wiadomości.
+- Oceń wszelkie elementy, które mogą sugerować próbę wyłudzenia informacji lub pieniędzy.
 </analysis>
 <risk_assessment>
 **Ocena Ryzyka Oszustwa:**
+- Oceń prawdopodobieństwo, że wiadomość jest oszustwem na skali od 1 do 10.
+- Zidentyfikuj i wyjaśnij czynniki, które wpływają na tę ocenę. Skoncentruj się na detalach, takich jak struktura wiadomości, użyte wyrażenia i kontekst.
 </risk_assessment>
 <recommendations>
 **Zalecenia dla Użytkownika:**
+- Przedstaw jasne i konkretne kroki, które użytkownik powinien podjąć.
+- Zaproponuj opcje, takie jak blokowanie nadawcy, zgłoszenie wiadomości lub ignorowanie jej.
+- Jeśli to możliwe, zaproponuj dodatkowe środki ostrożności, które użytkownik może podjąć, aby chronić swoje dane.
 </recommendations>
+Upewnij się, że każda sekcja jest wypełniona szczegółowo, przechodząc przez każdy potencjalny aspekt wiadomości.
         """,
         'German': """
+Du bist ein fortgeschrittener KI-Assistent, spezialisiert auf die Identifizierung gefälschter SMS-Nachrichten. Deine Aufgabe ist es, eine detaillierte, schrittweise Analyse der Nachricht durchzuführen. Deine Antwort sollte in drei Abschnitte unterteilt sein:
 <analysis>
 **Nachrichteninhaltsanalyse:**
+- Identifiziere potenzielle rote Flaggen wie: sprachliche Fehler, Anfragen nach persönlichen Daten, dringende Kontaktanfragen, Hinweise auf Finanztransaktionen.
+- Berücksichtige den sprachlich-kulturellen Kontext der Nachricht.
+- Bewerte alle Elemente, die auf einen Betrugsversuch hinweisen könnten.
 </analysis>
 <risk_assessment>
 **Betrugsrisikobewertung:**
+- Bewerte die Wahrscheinlichkeit, dass die Nachricht ein Betrug ist, auf einer Skala von 1 bis 10.
+- Erläutere die Faktoren, die diese Bewertung beeinflussen, einschließlich der Struktur der Nachricht und der verwendeten Ausdrücke.
 </risk_assessment>
 <recommendations>
 **Empfehlungen für den Benutzer:**
+- Gib klare und konkrete Handlungsempfehlungen.
+- Schlage Maßnahmen vor, wie das Blockieren des Absenders, das Melden der Nachricht oder das Ignorieren.
+- Gib zusätzliche Sicherheitsvorkehrungen an, um persönliche und finanzielle Daten zu schützen.
 </recommendations>
         """,
         'English': """
+You are an advanced AI assistant specializing in identifying fake SMS messages. Your task is to conduct a detailed, step-by-step analysis of the message, using a deep thinking process. Your response should be divided into three sections:
 <analysis>
 **Message Content Analysis:**
+- Identify potential red flags, such as: language errors, requests for personal information, urgent contact requests, suggestions of financial transfers, etc.
+- Evaluate the linguistic and cultural context of the message.
+- Assess any elements that may suggest an attempt to solicit information or money.
 </analysis>
 <risk_assessment>
 **Fraud Risk Assessment:**
+- Evaluate the likelihood that the message is fraudulent on a scale from 1 to 10.
+- Identify and explain the factors that influence this assessment, focusing on message structure, expressions used, and context.
 </risk_assessment>
 <recommendations>
 **User Recommendations:**
+- Provide clear and specific actions the user should take.
+- Suggest options like blocking the sender, reporting the message, or ignoring it.
+- If possible, recommend additional precautions to protect the user’s personal and financial data.
 </recommendations>
         """
     }
     system_prompt = system_prompts.get(language, system_prompts['English'])  # Default to English if language not found
     user_prompt = f"""Analyze the following message for potential fraud:
 Message: "{message}"
         if response.status_code == 200:
             data = response.json()
             ai_response = data['choices'][0]['message']['content']
+            # Parse the response
             analysis = re.search(r'<analysis>(.*?)</analysis>', ai_response, re.DOTALL)
             risk_assessment = re.search(r'<risk_assessment>(.*?)</risk_assessment>', ai_response, re.DOTALL)
             recommendations = re.search(r'<recommendations>(.*?)</recommendations>', ai_response, re.DOTALL)
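
`re.search` returns `None` when the model leaves out a tag, so each of the three matches needs a guard before `.group(1)` is read. The diff does not show how the matches are consumed, so the following is a minimal sketch under that assumption. `extract_sections` and `SECTION_TAGS` are hypothetical names.

```python
import re

# Tag names taken from the system prompts in this file.
SECTION_TAGS = ('analysis', 'risk_assessment', 'recommendations')


def extract_sections(ai_response, default=''):
    """Map each tag name to its stripped inner text, or `default` if the tag is absent."""
    sections = {}
    for tag in SECTION_TAGS:
        # re.DOTALL lets '.' span newlines, since sections are multi-line.
        match = re.search(rf'<{tag}>(.*?)</{tag}>', ai_response, re.DOTALL)
        sections[tag] = match.group(1).strip() if match else default
    return sections
```

A caller could then show a fallback text for any section that comes back as `default` instead of failing on a missing match.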
 def get_history():
     history_file = 'history.json'
     try:
+        with open(history_file, 'r') as f:
             history = json.load(f)
         return history
     except (json.JSONDecodeError, FileNotFoundError):