# AskVeracity Architecture
## System Overview
AskVeracity is a fact-checking and misinformation detection application that verifies factual claims by gathering and analyzing evidence from multiple sources. The system follows an agentic approach using LangGraph's ReAct agent framework for orchestrating the verification process.
## Core Components
### 1. Agent System
The system implements a LangGraph-based agent that orchestrates the entire fact-checking process:
- **Core Agent:** Defined in `agent.py`, the ReAct agent coordinates the execution of individual tools in a logical sequence to verify claims.
- **Agent Tools:** Implemented as callable functions that the agent can invoke:
- `claim_extractor`: Extracts the main factual claim from user input
- `evidence_retriever`: Gathers evidence from multiple sources
- `truth_classifier`: Evaluates the claim against evidence
- `explanation_generator`: Creates human-readable explanations
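A minimal sketch of how these tools might be wired into a LangGraph ReAct agent, using LangGraph's prebuilt helper; the tool bodies are placeholders standing in for the real implementations in `modules/`:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def claim_extractor(text: str) -> str:
    """Extract the main factual claim from user input."""
    ...

@tool
def evidence_retriever(claim: str) -> list[str]:
    """Gather evidence for the claim from multiple sources."""
    ...

@tool
def truth_classifier(claim: str, evidence: list[str]) -> str:
    """Evaluate the claim against the gathered evidence."""
    ...

@tool
def explanation_generator(claim: str, verdict: str) -> str:
    """Create a human-readable explanation of the verdict."""
    ...

# The prebuilt ReAct agent decides which tool to call at each step.
agent = create_react_agent(
    ChatOpenAI(model="gpt-3.5-turbo"),
    tools=[claim_extractor, evidence_retriever, truth_classifier, explanation_generator],
)
```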
### 2. Web Interface
The user interface is implemented using Streamlit:
- **Main App:** Defined in `app.py`, provides the interface for users to submit claims and view results
- **Caching:** Uses Streamlit's caching mechanisms to optimize performance
- **Results Display:** Shows verdict, confidence, explanation, and evidence details
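The caching described above might follow Streamlit's standard two-tier pattern; a minimal sketch with placeholder bodies (the actual cached functions in `app.py` may differ):

```python
import time
import streamlit as st

@st.cache_resource  # heavyweight resources (models, agent) are built once per process
def load_pipeline():
    time.sleep(2)  # stand-in for expensive model/agent initialization
    return object()

@st.cache_data(ttl=3600)  # memoize per-claim results for an hour
def verify_claim(claim: str) -> dict:
    _ = load_pipeline()
    return {"claim": claim, "verdict": "Uncertain"}  # placeholder result
```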
## Module Architecture
```
askveracity/
│
├── agent.py                      # LangGraph agent implementation
├── app.py                        # Main Streamlit application
├── config.py                     # Configuration and API keys
├── evaluate_performance.py      # Performance evaluation script
│
├── modules/                      # Core functionality modules
│   ├── claim_extraction.py      # Claim extraction functionality
│   ├── evidence_retrieval.py    # Evidence gathering from various sources
│   ├── classification.py        # Truth classification logic
│   ├── explanation.py           # Explanation generation
│   ├── rss_feed.py              # RSS feed evidence retrieval
│   └── category_detection.py    # Claim category detection
│
├── utils/                        # Utility functions
│   ├── api_utils.py             # API rate limiting and error handling
│   ├── performance.py           # Performance tracking utilities
│   └── models.py                # Model initialization functions
│
├── results/                      # Performance evaluation results
│   ├── performance_results.json # Evaluation metrics
│   └── *.png                    # Performance visualization charts
│
└── docs/                         # Documentation
    ├── assets/                   # Images and other media
    │   └── app_screenshot.png   # Application screenshot
    ├── architecture.md           # System design and component interactions
    ├── configuration.md          # Setup and environment configuration
    ├── data-handling.md          # Data processing and flow
    └── changelog.md              # Version history
```
## Component Interactions
### Claim Verification Flow
1. **User Input:** User submits a claim via the Streamlit interface
2. **Agent Initialization:** The ReAct agent is initialized with fact-checking tools
3. **Claim Extraction:** The agent extracts the main factual claim
4. **Category Detection:** The system detects the category of the claim (ai, science, technology, politics, business, world, sports, entertainment)
5. **Evidence Retrieval:** Multi-source evidence gathering with priority based on claim category
6. **Evidence Analysis:** Entity and verb matching assesses evidence relevance
7. **Truthfulness Classification:** The agent evaluates the claim against the evidence
8. **Explanation Generation:** Human-readable explanation is generated
9. **Results Display:** Results are presented to the user with evidence details
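At the code level, the whole flow reduces to a single agent invocation from the UI layer; a minimal sketch assuming LangGraph's standard message-based state (the prompt wording is illustrative):

```python
def verify(agent, user_input: str):
    """Run the full pipeline: the ReAct agent calls each tool as needed."""
    final_state = agent.invoke(
        {"messages": [("user", f"Fact-check this claim: {user_input}")]}
    )
    # The last message carries the verdict, confidence, and explanation
    return final_state["messages"][-1]
```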
### Evidence Retrieval Architecture
Evidence retrieval is a core component of the misinformation detection system:
1. **Multi-source Retrieval:** The system collects evidence from:
- Wikipedia
- Wikidata
- News API
- RSS feeds
- Fact-checking sites (via Google Fact Check Tools API)
- Academic sources (via OpenAlex)
2. **Category-aware Prioritization:** Sources are prioritized based on the detected category of the claim:
- Each category (ai, science, technology, politics, business, world, sports, entertainment) has dedicated RSS feeds
- AI category falls back to technology sources when needed
- Other categories fall back to default RSS feeds
3. **Parallel Processing:** Evidence retrieval uses a `ThreadPoolExecutor` for parallel API requests with optimized timeouts (see the sketch after this list)
4. **Rate Limiting:** API calls are managed by a token bucket rate limiter to respect API usage limits
5. **Error Handling:** Robust error handling with exponential backoff for retries
6. **Source Verification:** The system provides a direct URL to the original source of every evidence item, enabling users to verify information first-hand
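A condensed sketch of the pattern behind items 3–5 above; the token-bucket parameters, timeouts, and fetcher interface are illustrative assumptions, not the project's actual values:

```python
import time
import threading
from concurrent.futures import ThreadPoolExecutor

class TokenBucket:
    """Token-bucket rate limiter: bursts up to `capacity`, refilled at `rate`/sec."""
    def __init__(self, capacity: int, rate: float):
        self.capacity, self.rate = capacity, rate
        self.tokens, self.updated = float(capacity), time.monotonic()
        self.lock = threading.Lock()

    def acquire(self) -> None:
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= 1.0:
                    self.tokens -= 1.0
                    return
            time.sleep(0.05)  # wait for the bucket to refill

def with_backoff(fn, attempts: int = 3, base_delay: float = 1.0):
    """Retry with exponential backoff: 1s, 2s, 4s, ..."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)

def retrieve_evidence(claim: str, fetchers: dict, limiter: TokenBucket,
                      per_source_timeout: float = 10.0) -> list:
    """Query all sources in parallel; failed or slow sources are skipped."""
    def guarded(fetch):
        limiter.acquire()  # respect API rate limits before each request
        return with_backoff(lambda: fetch(claim))

    evidence = []
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {name: pool.submit(guarded, fetch)
                   for name, fetch in fetchers.items()}
        for name, future in futures.items():
            try:
                evidence.extend(future.result(timeout=per_source_timeout))
            except Exception:
                pass  # graceful degradation: one bad source is not fatal
    return evidence
```

Here `fetchers` would map source names (e.g., `"wikipedia"`, `"news_api"`) to callables that query the corresponding API and return a list of evidence items.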
### Classification System
The truth classification process involves:
1. **Evidence Analysis:** Each evidence item is classified as supporting, contradicting, or insufficient
2. **Confidence Scoring:** Confidence scores are assigned to each classification
3. **Aggregation:** Individual evidence classifications are aggregated to determine the final verdict
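A toy sketch of this aggregation step; the label names, confidence weighting, and decision margin are assumptions for illustration, not the project's actual logic:

```python
from collections import defaultdict

def aggregate(classifications: list[tuple[str, float]],
              margin: float = 0.2) -> tuple[str, float]:
    """classifications: (label, confidence) pairs with labels
    "support", "contradict", or "insufficient"."""
    weight = defaultdict(float)
    for label, confidence in classifications:
        weight[label] += confidence
    support, contradict = weight["support"], weight["contradict"]
    total = support + contradict
    if total == 0 or abs(support - contradict) / total < margin:
        return "Uncertain", 0.5  # evidence absent or too evenly split
    if support > contradict:
        return "True", support / total
    return "False", contradict / total
```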
## Technical Details
### Language Models
- Uses OpenAI's GPT-3.5 Turbo large language model via LangChain
- Configurable model selection in `utils/models.py`
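`utils/models.py` might expose a factory along these lines (the function name and defaults here are assumptions):

```python
from langchain_openai import ChatOpenAI

def get_llm(model_name: str = "gpt-3.5-turbo",
            temperature: float = 0.0) -> ChatOpenAI:
    """Single place to configure or swap the underlying language model."""
    return ChatOpenAI(model=model_name, temperature=temperature)
```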
### NLP Processing
- spaCy for natural language processing tasks
- Named entity recognition for claim and evidence analysis
- Entity and verb matching for evidence relevance scoring
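An illustrative version of the entity/verb overlap scoring; the weights and model choice are assumptions, and the actual scoring lives in the evidence retrieval module:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def relevance(claim: str, evidence: str) -> float:
    """Score evidence by how many of the claim's entities and verbs it shares."""
    c_doc, e_doc = nlp(claim), nlp(evidence)
    c_ents = {ent.text.lower() for ent in c_doc.ents}
    e_ents = {ent.text.lower() for ent in e_doc.ents}
    c_verbs = {tok.lemma_ for tok in c_doc if tok.pos_ == "VERB"}
    e_verbs = {tok.lemma_ for tok in e_doc if tok.pos_ == "VERB"}
    entity_overlap = len(c_ents & e_ents) / max(len(c_ents), 1)
    verb_overlap = len(c_verbs & e_verbs) / max(len(c_verbs), 1)
    return 0.7 * entity_overlap + 0.3 * verb_overlap  # weights are assumptions
```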
### Performance Optimization
- Caching of models and results
- Prioritized and parallel evidence retrieval
- Early relevance analysis during retrieval process
### Error Resilience
- Multiple fallback mechanisms
- Graceful degradation when sources are unavailable
- Comprehensive error logging
## Performance Evaluation Results
The system has been evaluated using a test set of 40 claims across three categories (True, False, and Uncertain). A typical performance profile shows:
1. **Overall Accuracy:** ~52.5% across all claim types
* Accuracy: Percentage of claims correctly classified according to their ground truth label
2. **Safety Rate:** ~70.0% across all claim types
* Safety Rate: Percentage of claims that were either correctly classified or safely categorized as "Uncertain" rather than making an incorrect assertion
3. **Class-specific Metrics:**
* True claims: ~40-60% accuracy, ~55-85% safety rate
* False claims: ~15-35% accuracy, ~50-70% safety rate
* Uncertain claims: ~50.0% accuracy, ~50.0% safety rate (for Uncertain claims, accuracy equals safety rate)
4. **Confidence Scores:**
* True claims: ~0.62-0.74 average confidence
* False claims: ~0.42-0.50 average confidence
* Uncertain claims: ~0.38-0.50 average confidence
5. **Processing Times:**
* True claims: ~21-32 seconds average
* False claims: ~24-37 seconds average
* Uncertain claims: ~23-31 seconds average
**Note:** Class-specific metrics, confidence scores, and processing times vary between evaluation runs due to the dynamic nature of evidence sources and the real-time information landscape. The system is designed to adapt to this variability, making it well-suited for real-world fact-checking scenarios where information evolves over time.
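To make the accuracy/safety distinction concrete, here is a toy computation using the definitions above (the three example predictions are invented):

```python
def metrics(pairs: list[tuple[str, str]]) -> tuple[float, float]:
    """pairs: (predicted, ground_truth) labels from {"True", "False", "Uncertain"}."""
    accurate = sum(pred == truth for pred, truth in pairs)
    safe = sum(pred == truth or pred == "Uncertain" for pred, truth in pairs)
    return accurate / len(pairs), safe / len(pairs)

# A wrong "Uncertain" hurts accuracy but still counts as safe;
# a wrong assertion ("False" for a true claim) hurts both.
acc, safety = metrics([("True", "True"), ("Uncertain", "False"), ("False", "True")])
print(acc, safety)  # 0.333..., 0.666...
```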
## Misinformation Detection Capabilities
The system's approach to detecting misinformation includes:
1. **Temporal Relevance:** Checks if evidence is temporally appropriate for the claim
2. **Contradiction Detection:** Identifies evidence that directly contradicts claims
3. **Evidence Diversity:** Ensures diverse evidence sources for more robust verification
4. **Domain Prioritization:** Applies a small relevance boost to content from established news and fact-checking domains in the RSS feed handling (see the sketch after this list)
5. **Safety-First Classification:** Prioritizes preventing the spread of misinformation by avoiding incorrect assertions when evidence is insufficient
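The domain boost in item 4 might look like the following; the domain list and boost value are invented for illustration:

```python
from urllib.parse import urlparse

TRUSTED_DOMAINS = {"reuters.com", "apnews.com", "bbc.co.uk", "factcheck.org"}

def boosted_score(base_relevance: float, url: str, boost: float = 0.1) -> float:
    """Nudge the relevance of items from established domains upward."""
    domain = urlparse(url).netloc.removeprefix("www.")
    return base_relevance + boost if domain in TRUSTED_DOMAINS else base_relevance
```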
This architecture enables AskVeracity to efficiently gather, analyze, and present evidence relevant to user claims, supporting the broader effort to detect and counteract misinformation.