# AskVeracity Architecture

## System Overview
AskVeracity is a fact-checking and misinformation detection application that verifies factual claims by gathering and analyzing evidence from multiple sources. The system follows an agentic approach using LangGraph's ReAct agent framework for orchestrating the verification process.
## Core Components

### 1. Agent System
The system implements a LangGraph-based agent that orchestrates the entire fact-checking process:
- **Core Agent:** Defined in `agent.py`, the ReAct agent coordinates the execution of individual tools in a logical sequence to verify claims.
- **Agent Tools:** Implemented as callable functions that the agent can invoke:
  - `claim_extractor`: Extracts the main factual claim from user input
  - `evidence_retriever`: Gathers evidence from multiple sources
  - `truth_classifier`: Evaluates the claim against the evidence
  - `explanation_generator`: Creates human-readable explanations
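A minimal sketch of how these tools might be wired into a LangGraph ReAct agent, assuming LangGraph's prebuilt `create_react_agent`; the tool bodies and the `build_agent` name are illustrative placeholders, not the project's actual `agent.py`:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def claim_extractor(text: str) -> str:
    """Extract the main factual claim from the user's input."""
    return text  # placeholder body

@tool
def evidence_retriever(claim: str) -> list:
    """Gather evidence items for the claim from multiple sources."""
    return []  # placeholder body

# truth_classifier and explanation_generator would be declared the same way.

def build_agent():
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    return create_react_agent(llm, [claim_extractor, evidence_retriever])
```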
### 2. Web Interface
The user interface is implemented using Streamlit:
- **Main App:** Defined in `app.py`, provides the interface for users to submit claims and view results
- **Caching:** Uses Streamlit's caching mechanisms to optimize performance
- **Results Display:** Shows verdict, confidence, explanation, and evidence details
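As an illustration, Streamlit's two caching decorators could be applied roughly as follows; `build_agent` comes from the hypothetical sketch above, and the TTL value is an assumption:

```python
import streamlit as st

@st.cache_resource  # reuse heavyweight objects (agent, models) across reruns
def load_agent():
    from agent import build_agent  # hypothetical factory, see sketch above
    return build_agent()

@st.cache_data(ttl=600)  # reuse results for identical claims for 10 minutes
def verify(claim: str) -> dict:
    return load_agent().invoke({"messages": [("user", claim)]})
```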
## Module Architecture
```
askveracity/
│
├── agent.py                   # LangGraph agent implementation
├── app.py                     # Main Streamlit application
├── config.py                  # Configuration and API keys
├── evaluate_performance.py    # Performance evaluation script
│
├── modules/                   # Core functionality modules
│   ├── claim_extraction.py    # Claim extraction functionality
│   ├── evidence_retrieval.py  # Evidence gathering from various sources
│   ├── classification.py      # Truth classification logic
│   ├── explanation.py         # Explanation generation
│   ├── rss_feed.py            # RSS feed evidence retrieval
│   └── category_detection.py  # Claim category detection
│
├── utils/                     # Utility functions
│   ├── api_utils.py           # API rate limiting and error handling
│   ├── performance.py         # Performance tracking utilities
│   └── models.py              # Model initialization functions
│
├── results/                   # Performance evaluation results
│   ├── performance_results.json  # Evaluation metrics
│   └── *.png                  # Performance visualization charts
│
└── docs/                      # Documentation
    ├── assets/                # Images and other media
    │   └── app_screenshot.png # Application screenshot
    ├── architecture.md        # System design and component interactions
    ├── configuration.md       # Setup and environment configuration
    ├── data-handling.md       # Data processing and flow
    └── changelog.md           # Version history
```
## Component Interactions

### Claim Verification Flow
1. **User Input:** User submits a claim via the Streamlit interface
2. **Agent Initialization:** The ReAct agent is initialized with fact-checking tools
3. **Claim Extraction:** The agent extracts the main factual claim
4. **Category Detection:** The system detects the category of the claim (ai, science, technology, politics, business, world, sports, entertainment)
5. **Evidence Retrieval:** Multi-source evidence gathering with priority based on claim category
6. **Evidence Analysis:** Entity and verb matching assesses evidence relevance
7. **Truthfulness Classification:** The agent evaluates the claim against the evidence
8. **Explanation Generation:** A human-readable explanation is generated
9. **Results Display:** Results are presented to the user with evidence details
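With the prebuilt ReAct agent, steps 3–8 happen inside a single agent invocation; a usage sketch reusing the hypothetical `load_agent` from the caching example above:

```python
# Assumes the hypothetical load_agent() from the caching sketch above.
agent = load_agent()
state = agent.invoke({"messages": [("user", "Claim text to verify")]})
print(state["messages"][-1].content)  # final verdict and explanation
```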
## Evidence Retrieval Architecture
Evidence retrieval is a core component of the misinformation detection system:
**Multi-source Retrieval:** The system collects evidence from:
- Wikipedia
- Wikidata
- News API
- RSS feeds
- Fact-checking sites (via Google Fact Check Tools API)
- Academic sources (via OpenAlex)
**Category-aware Prioritization:** Sources are prioritized based on the detected category of the claim:
- Each category (ai, science, technology, politics, business, world, sports, entertainment) has dedicated RSS feeds
- AI category falls back to technology sources when needed
- Other categories fall back to default RSS feeds (a fallback lookup is sketched below)
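A minimal sketch of this fallback logic, with feed URLs omitted; the real `modules/rss_feed.py` and `modules/category_detection.py` may organize it differently:

```python
RSS_FEEDS: dict[str, list[str]] = {
    "ai": [],          # feed URLs omitted for brevity; illustrative only
    "technology": [],
    "default": [],
}

def feeds_for(category: str) -> list[str]:
    feeds = RSS_FEEDS.get(category, [])
    if not feeds and category == "ai":
        feeds = RSS_FEEDS["technology"]   # AI falls back to technology feeds
    return feeds or RSS_FEEDS["default"]  # others fall back to the default set
```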
**Parallel Processing:** Evidence retrieval uses `ThreadPoolExecutor` for parallel API requests with optimized timeouts

**Rate Limiting:** API calls are managed by a token-bucket rate limiter to respect API usage limits

**Error Handling:** Robust error handling with exponential backoff for retries; the sketch below combines these three mechanisms

**Source Verification:** The system provides a direct URL to the original source of every evidence item, enabling users to verify information where it was published
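The sketch below combines the parallel retrieval, token-bucket rate limiting, and backoff behavior described above under assumed names; it is illustrative, not the project's actual `utils/api_utils.py`:

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

class TokenBucket:
    """Allow roughly `rate` calls per second, bursting up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = float(capacity), time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            time.sleep(wait)  # sleep outside the lock, then retry

def with_backoff(fn, retries: int = 3):
    """Call fn(), retrying with exponential backoff plus jitter on failure."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(2 ** attempt + random.random())

def gather_evidence(claim: str, sources: dict, limiter: TokenBucket) -> list:
    """Query every evidence source in parallel; failed sources are skipped."""
    def fetch(source_fn):
        limiter.acquire()                      # respect API usage limits
        return with_backoff(lambda: source_fn(claim))

    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(fetch, fn) for fn in sources.values()]
        for future in as_completed(futures):
            try:
                results.extend(future.result())
            except Exception:
                pass  # graceful degradation when a source stays unavailable
    return results
```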
## Classification System
The truth classification process involves:
1. **Evidence Analysis:** Each evidence item is classified as supporting, contradicting, or insufficient
2. **Confidence Scoring:** Confidence scores are assigned to each classification
3. **Aggregation:** Individual evidence classifications are aggregated to determine the final verdict
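A simplified confidence-weighted aggregation might look like this; the labels, tie handling, and scoring are assumptions rather than the project's exact logic:

```python
def aggregate(items: list[tuple[str, float]]) -> tuple[str, float]:
    """items: (label, confidence) pairs, label in
    {"support", "contradict", "insufficient"}."""
    support = sum(conf for label, conf in items if label == "support")
    contradict = sum(conf for label, conf in items if label == "contradict")
    total = support + contradict
    if total == 0:
        return "Uncertain", 0.0          # only insufficient evidence found
    if support > contradict:
        return "True", support / total
    if contradict > support:
        return "False", contradict / total
    return "Uncertain", 0.5              # evidence is evenly split
```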
## Technical Details

### Language Models
- Uses OpenAI's GPT-3.5 Turbo large language model via LangChain
- Configurable model selection in `utils/models.py`
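For instance, a configurable factory in the spirit of `utils/models.py` might look like this (the function name and defaults are assumptions):

```python
from langchain_openai import ChatOpenAI

def get_llm(model_name: str = "gpt-3.5-turbo",
            temperature: float = 0.0) -> ChatOpenAI:
    """Return a LangChain chat model; change model_name to swap models."""
    return ChatOpenAI(model=model_name, temperature=temperature)
```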
### NLP Processing
- spaCy for natural language processing tasks
- Named entity recognition for claim and evidence analysis
- Entity and verb matching for evidence relevance scoring
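A rough illustration of entity-and-verb overlap scoring with spaCy; the pipeline name and the 0.7/0.3 weights are assumptions:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline

def relevance(claim: str, evidence: str) -> float:
    """Score evidence by how many of the claim's entities and verbs it shares."""
    c_doc, e_doc = nlp(claim), nlp(evidence)
    c_ents = {ent.text.lower() for ent in c_doc.ents}
    e_ents = {ent.text.lower() for ent in e_doc.ents}
    c_verbs = {tok.lemma_ for tok in c_doc if tok.pos_ == "VERB"}
    e_verbs = {tok.lemma_ for tok in e_doc if tok.pos_ == "VERB"}
    ent_overlap = len(c_ents & e_ents) / max(1, len(c_ents))
    verb_overlap = len(c_verbs & e_verbs) / max(1, len(c_verbs))
    return 0.7 * ent_overlap + 0.3 * verb_overlap
```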
### Performance Optimization
- Caching of models and results
- Prioritized and parallel evidence retrieval
- Early relevance analysis during retrieval process
### Error Resilience
- Multiple fallback mechanisms
- Graceful degradation when sources are unavailable
- Comprehensive error logging
## Performance Evaluation Results
The system has been evaluated using a test set of 40 claims across three categories (True, False, and Uncertain). A typical performance profile shows:
**Overall Accuracy:** ~52.5% across all claim types
- Accuracy: percentage of claims correctly classified according to their ground-truth label

**Safety Rate:** ~70.0% across all claim types
- Safety Rate: percentage of claims that were either correctly classified or safely categorized as "Uncertain" rather than making an incorrect assertion
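Concretely, given per-claim (predicted, actual) label pairs, the two headline numbers reduce to:

```python
def headline_metrics(results: list[tuple[str, str]]) -> tuple[float, float]:
    """results: (predicted, actual) label pairs, e.g. ("Uncertain", "False")."""
    n = len(results)
    accuracy = sum(p == a for p, a in results) / n
    safety = sum(p == a or p == "Uncertain" for p, a in results) / n
    return accuracy, safety
```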
**Class-specific Metrics:**
- True claims: ~40-60% accuracy, ~55-85% safety rate
- False claims: ~15-35% accuracy, ~50-70% safety rate
- Uncertain claims: ~50.0% accuracy, ~50.0% safety rate (for Uncertain claims, accuracy equals safety rate)
**Confidence Scores:**
- True claims: ~0.62-0.74 average confidence
- False claims: ~0.42-0.50 average confidence
- Uncertain claims: ~0.38-0.50 average confidence
**Processing Times:**
- True claims: ~21-32 seconds average
- False claims: ~24-37 seconds average
- Uncertain claims: ~23-31 seconds average
*Note:* Class-specific metrics, confidence scores, and processing times vary between evaluation runs due to the dynamic nature of evidence sources and the real-time information landscape. The system is designed to adapt to this variability, which makes it well suited to real-world fact-checking scenarios where information evolves over time.
## Misinformation Detection Capabilities
The system's approach to detecting misinformation includes:
- **Temporal Relevance:** Checks if evidence is temporally appropriate for the claim
- **Contradiction Detection:** Identifies evidence that directly contradicts claims
- **Evidence Diversity:** Ensures diverse evidence sources for more robust verification
- **Domain Prioritization:** Applies a small relevance boost to content from established news and fact-checking domains in the RSS feed handling (a sketch follows this list)
- **Safety-First Classification:** Prioritizes preventing the spread of misinformation by avoiding incorrect assertions when evidence is insufficient
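The domain boost, for example, could be as simple as the following sketch; the domain list and the 1.1 multiplier are illustrative assumptions:

```python
from urllib.parse import urlparse

TRUSTED = {"reuters.com", "apnews.com", "bbc.com", "snopes.com"}  # illustrative

def boosted_score(score: float, url: str) -> float:
    """Nudge relevance upward for established news and fact-checking domains."""
    domain = urlparse(url).netloc.lower().removeprefix("www.")
    return score * 1.1 if domain in TRUSTED else score
```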
This architecture enables AskVeracity to efficiently gather, analyze, and present evidence relevant to user claims, supporting the broader effort to detect and counteract misinformation.