ankanghosh committed
Commit 1650d98 · verified
1 Parent(s): b4cd667

Upload README.md

Files changed (1)
  1. README.md +168 -0 (added)
# AskVeracity: Fact Checking System

A streamlined web application that analyzes claims to determine their truthfulness through evidence gathering and analysis.

## Overview

This application uses an agentic AI approach to verify factual claims through a combination of NLP techniques and large language models.

The AI agent:
1. Uses a ReAct (Reasoning + Acting) methodology to analyze claims
2. Dynamically gathers evidence from multiple sources (Wikipedia, News APIs, RSS feeds, fact-checking sites)
3. Intelligently decides which tools to use and in what order based on the claim's category
4. Classifies the truthfulness of claims using the collected evidence
5. Provides transparency into its reasoning process
6. Generates clear explanations for its verdict with confidence scores

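The reason/act/observe cycle described above can be illustrated with a small, deterministic sketch. It is not the project's agent: the tool functions, the `TOOL_ORDER` mapping, and `run_agent` are hypothetical names invented for this example, and a real ReAct agent would let the language model choose the next tool rather than follow a fixed table.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for real evidence tools; they return canned snippets.
def search_wikipedia(claim: str) -> List[str]:
    return [f"wikipedia snippet about: {claim}"]

def search_news(claim: str) -> List[str]:
    return [f"news snippet about: {claim}"]

def search_factchecks(claim: str) -> List[str]:
    return []  # pretend no published fact-check matches this claim

TOOLS: Dict[str, Callable[[str], List[str]]] = {
    "wikipedia": search_wikipedia,
    "news": search_news,
    "factcheck": search_factchecks,
}

# Assumed category -> tool order table (illustrative only).
TOOL_ORDER: Dict[str, List[str]] = {
    "science": ["wikipedia", "factcheck", "news"],
    "politics": ["factcheck", "news", "wikipedia"],
}
DEFAULT_ORDER = ["wikipedia", "news", "factcheck"]

def run_agent(claim: str, category: str, min_evidence: int = 2) -> dict:
    """Gather evidence tool by tool, recording each reasoning step."""
    evidence: List[str] = []
    trace: List[str] = []
    for tool_name in TOOL_ORDER.get(category, DEFAULT_ORDER):
        # Reason: decide that more evidence is needed and which tool to try.
        trace.append(
            f"Thought: have {len(evidence)}/{min_evidence} items, trying {tool_name}"
        )
        # Act: call the chosen tool.
        results = TOOLS[tool_name](claim)
        # Observe: record what came back.
        trace.append(f"Observation: {tool_name} returned {len(results)} item(s)")
        evidence.extend(results)
        if len(evidence) >= min_evidence:
            break
    return {"evidence": evidence, "trace": trace}
```

Returning the `trace` alongside the evidence is one simple way to expose the reasoning steps to the user interface.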
## Features

- **Claim Extraction**: Identifies and focuses on the primary factual claim
- **Category Detection**: Determines the claim's category to optimize evidence retrieval
- **Multi-source Evidence**: Gathers evidence from Wikipedia, news articles, academic sources, and fact-checking sites
- **Semantic Analysis**: Analyzes evidence relevance using advanced NLP techniques
- **Transparent Classification**: Provides clear verdicts with confidence scores
- **Detailed Explanations**: Generates human-readable explanations for verdicts
- **Interactive UI**: Easy-to-use Streamlit interface with evidence exploration

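To make the idea of relevance analysis concrete, here is a minimal sketch that ranks evidence snippets against a claim. This README does not say which NLP technique the project uses, so the sketch substitutes plain token overlap (Jaccard similarity) as a stand-in; `tokenize`, `relevance`, and `rank_evidence` are hypothetical names.

```python
import re
from typing import List, Set, Tuple

def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def relevance(claim: str, evidence: str) -> float:
    """Jaccard similarity between claim tokens and evidence tokens (0.0 to 1.0)."""
    claim_tokens, evidence_tokens = tokenize(claim), tokenize(evidence)
    if not claim_tokens or not evidence_tokens:
        return 0.0
    shared = claim_tokens & evidence_tokens
    return len(shared) / len(claim_tokens | evidence_tokens)

def rank_evidence(claim: str, snippets: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Return the top_k snippets as (score, snippet) pairs, most relevant first."""
    scored = [(relevance(claim, snippet), snippet) for snippet in snippets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Token overlap ignores synonyms and word order, so an embedding-based similarity would likely serve a production system better; the ranking interface would stay the same.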
## Project Structure

```
askveracity/
│
├── app.py                        # Main Streamlit application
├── agent.py                      # LangGraph agent implementation
├── config.py                     # Configuration and API keys
├── requirements.txt              # Dependencies for the application
├── .streamlit/                   # Streamlit configuration
│   ├── config.toml               # UI theme configuration
│   └── secrets.toml.example      # Example secrets file (do not commit actual secrets)
├── utils/
│   ├── __init__.py
│   ├── api_utils.py              # API rate limiting and error handling
│   ├── performance.py            # Performance tracking utilities
│   └── models.py                 # Model initialization functions
├── modules/
│   ├── __init__.py
│   ├── claim_extraction.py       # Claim extraction functionality
│   ├── evidence_retrieval.py     # Evidence gathering from various sources
│   ├── classification.py         # Truth classification logic
│   ├── explanation.py            # Explanation generation
│   ├── rss_feed.py               # RSS feed evidence retrieval
│   ├── semantic_analysis.py      # Relevance analysis for evidence
│   └── category_detection.py     # Claim category detection
├── data/
│   └── source_credibility.json   # Source credibility data
└── tests/
    ├── __init__.py
    └── test_claim_extraction.py  # Unit tests for claim extraction
```

## Setup and Installation

### Local Development

1. Clone this repository:
   ```
   git clone https://github.com/yourusername/askveracity.git
   cd askveracity
   ```

2. Install the required dependencies:
   ```
   pip install -r requirements.txt
   ```

3. Set up your API keys:

   You have two options:

   **Option 1: Using Streamlit secrets (recommended for local development)**

   - Copy the example secrets file to create your own:
     ```
     cp .streamlit/secrets.toml.example .streamlit/secrets.toml
     ```
   - Edit `.streamlit/secrets.toml` and add your API keys:
     ```toml
     OPENAI_API_KEY = "your_openai_api_key"
     NEWS_API_KEY = "your_news_api_key"
     FACTCHECK_API_KEY = "your_factcheck_api_key"
     ```

   **Option 2: Using environment variables**

   Create a `.env` file in the root directory with the following content:
   ```
   OPENAI_API_KEY=your_openai_api_key
   NEWS_API_KEY=your_news_api_key
   FACTCHECK_API_KEY=your_factcheck_api_key
   ```

4. If you chose environment variables (Option 2), load them before the app reads them:

   Either at the start of your Python script:
   ```python
   # In Python (requires the python-dotenv package)
   from dotenv import load_dotenv
   load_dotenv()  # reads variables from a .env file into os.environ
   ```

   Or in your terminal before running the app:
   ```bash
   # Unix/Linux/macOS: export every variable defined in .env,
   # so that the Streamlit process inherits them
   set -a
   source .env
   set +a

   # Windows
   # Install python-dotenv[cli] and run
   dotenv run streamlit run app.py
   ```

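Because keys may come from either Streamlit secrets or environment variables, application code needs one lookup that tries both. The following is a minimal sketch of such a lookup, not the project's actual `config.py`; `get_api_key` is a hypothetical helper, and `secrets` can be any mapping (for example Streamlit's `st.secrets`, assuming a secrets file exists).

```python
import os
from typing import Mapping, Optional

def get_api_key(
    name: str,
    secrets: Optional[Mapping[str, str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the key from `secrets` if present, else from the environment."""
    environ = os.environ if environ is None else environ
    # Prefer the secrets mapping (Option 1) when it contains the key.
    if secrets is not None and name in secrets:
        return secrets[name]
    # Fall back to environment variables (Option 2).
    value = environ.get(name)
    if value:
        return value
    raise KeyError(f"{name} not found in secrets or environment variables")
```

Failing with a clear error at startup is usually easier to debug than a rejected API request later on.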
### Running the Application

Launch the Streamlit app by running:
```
streamlit run app.py
```

### Deploying to Hugging Face Spaces

1. Fork this repository to your GitHub account
2. Create a new Space on Hugging Face:
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Select "Streamlit" as the SDK
   - Choose "From GitHub" as the source
   - Connect to your GitHub repository

3. Add the required API keys as secrets:
   - Go to the "Settings" tab of your Space
   - Navigate to the "Repository secrets" section
   - Add the following secrets:
     - `OPENAI_API_KEY`
     - `NEWS_API_KEY`
     - `FACTCHECK_API_KEY`

4. Your Space will automatically deploy with the changes

## Rate Limiting and API Considerations

The application implements intelligent rate limiting for API calls to:
- Wikipedia
- Wikidata
- News API
- Google FactCheck Tools
- RSS feeds

The system retries failed calls with exponential backoff and optimizes API usage to stay within free API tiers. Rate limits can be configured in the `config.py` file.

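As an illustration of retrying with exponential backoff, here is a minimal sketch. It is not the code in `utils/api_utils.py`; the function name `call_with_backoff` and its parameters (`max_retries`, `base_delay`, `max_delay`, `jitter`) are assumptions made for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    jitter: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on any exception with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Add a little random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * jitter))
    raise RuntimeError("unreachable")  # loop always returns or raises
```

With the defaults, a call that keeps failing waits roughly 1, 2, and 4 seconds before giving up. Passing `sleep` as a parameter keeps the function easy to test without real delays.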
## Best Practices for Claim Verification

For optimal results with AskVeracity:
- Keep claims short and precise
- Include key details in your claim
- Phrase claims as direct statements rather than questions
- Be specific about who said what, when relevant

## License

This project is licensed under the [MIT License](./LICENSE), allowing free use, modification, and distribution with proper attribution.