Spaces:
Build error
Build error
Deploy Whale_Arbitrum on HF Spaces
Browse files- .env +11 -0
- README.md +135 -0
- app.py +719 -0
- modules/__init__.py +1 -0
- modules/__pycache__/__init__.cpython-312.pyc +0 -0
- modules/__pycache__/api_client.cpython-312.pyc +0 -0
- modules/__pycache__/crew_system.cpython-312.pyc +0 -0
- modules/__pycache__/crew_tools.cpython-312.pyc +0 -0
- modules/__pycache__/data_processor.cpython-312.pyc +0 -0
- modules/__pycache__/detection.cpython-312.pyc +0 -0
- modules/__pycache__/visualizer.cpython-312.pyc +0 -0
- modules/api_client.py +768 -0
- modules/crew_system.py +1117 -0
- modules/crew_tools.py +362 -0
- modules/data_processor.py +1425 -0
- modules/detection.py +684 -0
- modules/tools.py +373 -0
- modules/visualizer.py +638 -0
- requirements.txt +12 -0
- test_api.py +205 -0
.env
ADDED
@@ -0,0 +1,11 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# Your current API key appears to be having issues
|
2 |
+
# Please replace it with your own key from https://arbiscan.io/myapikey
|
3 |
+
# Uncomment one of the API keys below or add your own
|
4 |
+
ARBISCAN_API_KEY=4YEN1UTUEZ8I8ZBWSZW5NH6ZDFYEUVKQ5U
|
5 |
+
# ARBISCAN_API_KEY=HVZC2W3IZWCGJWS8QDBZ56D1GZZNDJMZ25
|
6 |
+
|
7 |
+
# Gemini API key for price data
|
8 |
+
GEMINI_API_KEY=AIzaSyCyble5D3dlgPxDXWLlaZmu8hOM_nt-V6M
|
9 |
+
|
10 |
+
# OpenAI API key for CrewAI functionality
|
11 |
+
OPENAI_API_KEY=your-openai-api-key
|
README.md
ADDED
@@ -0,0 +1,135 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# Whale Wallet AI β Market Manipulation Detection
|
2 |
+
|
3 |
+
A powerful Streamlit-based tool that tracks large holders ("whales") on the Arbitrum network to uncover potential market manipulation tactics.
|
4 |
+
|
5 |
+
## 1. Prerequisites & Setup
|
6 |
+
|
7 |
+
### 1.1. Python & Dependencies
|
8 |
+
- Ensure you have Python 3.8+ installed.
|
9 |
+
- Install required packages via:
|
10 |
+
```bash
|
11 |
+
pip install -r requirements.txt
|
12 |
+
```
|
13 |
+
|
14 |
+
### 1.2. API Keys
|
15 |
+
You need API keys to fetch on-chain data and real-time prices:
|
16 |
+
- **ARBISCAN_API_KEY**: For fetching Arbitrum transaction data
|
17 |
+
- **GEMINI_API_KEY**: For retrieving live token prices
|
18 |
+
- **OPENAI_API_KEY**: For powering the CrewAI agents
|
19 |
+
|
20 |
+
Save these in a file named `.env` at the project root:
|
21 |
+
```env
|
22 |
+
ARBISCAN_API_KEY=your_arbiscan_key
|
23 |
+
GEMINI_API_KEY=your_gemini_key
|
24 |
+
OPENAI_API_KEY=your_openai_key
|
25 |
+
```
|
26 |
+
Note: Sample API keys are provided in the default .env file, but you should replace them with your own for production use.
|
27 |
+
|
28 |
+
### 1.3. Run the App
|
29 |
+
Launch the web interface with:
|
30 |
+
```bash
|
31 |
+
streamlit run app.py
|
32 |
+
```
|
33 |
+
|
34 |
+
## 2. Core Features & How to Use Them
|
35 |
+
|
36 |
+
### 2.1 Track Large Buy/Sell Transactions
|
37 |
+
|
38 |
+
**What it does:**
|
39 |
+
Monitors on-chain transfers exceeding a configurable threshold (e.g., 1,000 tokens or $100K) for any wallet or contract you specify.
|
40 |
+
|
41 |
+
**How to use:**
|
42 |
+
1. In the sidebar, enter one or more wallet addresses
|
43 |
+
2. Set your minimum token or USD value filter
|
44 |
+
3. Click **Track Transactions**
|
45 |
+
4. The dashboard will list incoming/outgoing transfers above the threshold.
|
46 |
+
|
47 |
+
### 2.2 Identify Trading Patterns of Whale Wallets
|
48 |
+
|
49 |
+
**What it does:**
|
50 |
+
Uses time-series clustering and sequence analysis to surface recurring behaviors (e.g., cyclical dumping, accumulation bursts).
|
51 |
+
|
52 |
+
**How to use:**
|
53 |
+
1. Select a wallet address
|
54 |
+
2. Choose a time period (e.g., last 7 days)
|
55 |
+
3. Click **Analyze Patterns**
|
56 |
+
4. View a summary of detected clusters and drill down into individual events.
|
57 |
+
|
58 |
+
### 2.3 Analyze Impact of Whale Transactions on Token Prices
|
59 |
+
|
60 |
+
**What it does:**
|
61 |
+
Correlates large trades against minute-by-minute price ticks to quantify slippage, price spikes, or dumps.
|
62 |
+
|
63 |
+
**How to use:**
|
64 |
+
1. Enable **Price Impact** analysis in settings
|
65 |
+
2. Specify lookback/lookahead windows (e.g., 5 minutes)
|
66 |
+
3. Click **Run Impact Analysis**
|
67 |
+
4. See interactive line charts and slippage metrics.
|
68 |
+
|
69 |
+
### 2.4 Detect Potential Market Manipulation Techniques
|
70 |
+
|
71 |
+
**What it does:**
|
72 |
+
Automatically flags suspicious behaviors such as:
|
73 |
+
- **Pump-and-Dump:** Rapid buys followed by coordinated sell-offs
|
74 |
+
- **Wash Trading:** Self-trading across multiple addresses
|
75 |
+
- **Spoofing:** Large orders placed then canceled
|
76 |
+
|
77 |
+
**How to use:**
|
78 |
+
1. Toggle **Manipulation Detection** on
|
79 |
+
2. Adjust sensitivity slider (Low/Medium/High)
|
80 |
+
3. Click **Detect**
|
81 |
+
4. Examine the **Alerts** panel for flagged events.
|
82 |
+
|
83 |
+
### 2.5 Generate Reports & Visualizations
|
84 |
+
|
85 |
+
**What it does:**
|
86 |
+
Compiles whale activity into PDF/CSV summaries and interactive charts.
|
87 |
+
|
88 |
+
**How to use:**
|
89 |
+
1. Select **Export** in the top menu
|
90 |
+
2. Choose **CSV**, **PDF**, or **PNG**
|
91 |
+
3. Specify time range and wallets to include
|
92 |
+
4. Click **Download**
|
93 |
+
5. Saved file will appear in your browser's download folder.
|
94 |
+
|
95 |
+
## 3. Advanced Features: CrewAI Integration
|
96 |
+
|
97 |
+
This application leverages CrewAI to provide advanced analysis through specialized AI agents:
|
98 |
+
|
99 |
+
- **Blockchain Data Collector**: Extracts and organizes on-chain data
|
100 |
+
- **Price Impact Analyst**: Correlates trading activity with price movements
|
101 |
+
- **Trading Pattern Detector**: Identifies recurring behavioral patterns
|
102 |
+
- **Market Manipulation Investigator**: Detects potential market abuse
|
103 |
+
- **Insights Reporter**: Transforms data into actionable intelligence
|
104 |
+
|
105 |
+
## 4. Project Structure
|
106 |
+
|
107 |
+
```
|
108 |
+
/Whale_Arbitrum/
|
109 |
+
βββ app.py # Main Streamlit application entry point
|
110 |
+
βββ requirements.txt # Dependencies and package versions
|
111 |
+
βββ .env # API keys and environment variables
|
112 |
+
βββ modules/
|
113 |
+
β βββ api_client.py # Arbiscan and Gemini API clients
|
114 |
+
β βββ data_processor.py # Data processing and analysis
|
115 |
+
β βββ detection.py # Market manipulation detection algorithms
|
116 |
+
β βββ visualizer.py # Visualization and report generation
|
117 |
+
β βββ crew_system.py # CrewAI agentic system
|
118 |
+
```
|
119 |
+
|
120 |
+
## 5. Use Cases
|
121 |
+
|
122 |
+
- **Regulatory Compliance & Fraud Detection**
|
123 |
+
Auditors and regulators can monitor DeFi markets for wash trades and suspicious dumps.
|
124 |
+
|
125 |
+
- **Investment Strategy Optimization**
|
126 |
+
Traders gain insight into institutional flows and can calibrate entry/exit points.
|
127 |
+
|
128 |
+
- **Market Research & Analysis**
|
129 |
+
Researchers study whale behavior to gauge token health and potential volatility.
|
130 |
+
|
131 |
+
- **DeFi Protocol Security Monitoring**
|
132 |
+
Protocol teams receive alerts on large dumps that may destabilize liquidity pools.
|
133 |
+
|
134 |
+
- **Token Project Risk Assessment**
|
135 |
+
Token issuers review top-holder actions to flag governance or distribution issues.
|
app.py
ADDED
@@ -0,0 +1,719 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
import streamlit as st
|
2 |
+
import pandas as pd
|
3 |
+
import numpy as np
|
4 |
+
import plotly.express as px
|
5 |
+
import plotly.graph_objects as go
|
6 |
+
import os
|
7 |
+
import json
|
8 |
+
import logging
|
9 |
+
import time
|
10 |
+
from datetime import datetime, timedelta
|
11 |
+
from typing import Dict, List, Optional, Union, Any
|
12 |
+
from dotenv import load_dotenv
|
13 |
+
|
14 |
+
# Configure logging - Reduce verbosity and improve performance
|
15 |
+
logging.basicConfig(
|
16 |
+
level=logging.WARNING, # Only show warnings and errors by default
|
17 |
+
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
18 |
+
)
|
19 |
+
|
20 |
+
# Create a custom filter to suppress repetitive Gemini API errors
|
21 |
+
class SuppressRepetitiveErrors(logging.Filter):
|
22 |
+
def __init__(self):
|
23 |
+
super().__init__()
|
24 |
+
self.error_counts = {}
|
25 |
+
self.max_errors = 3 # Show at most 3 instances of each error
|
26 |
+
|
27 |
+
def filter(self, record):
|
28 |
+
if record.levelno < logging.WARNING:
|
29 |
+
return True
|
30 |
+
|
31 |
+
# If it's a Gemini API error for non-existent tokens, suppress it after a few occurrences
|
32 |
+
if 'Error fetching historical prices from Gemini API' in record.getMessage():
|
33 |
+
key = 'gemini_api_error'
|
34 |
+
self.error_counts[key] = self.error_counts.get(key, 0) + 1
|
35 |
+
|
36 |
+
# Only allow the first few errors through
|
37 |
+
return self.error_counts[key] <= self.max_errors
|
38 |
+
|
39 |
+
return True
|
40 |
+
|
41 |
+
# Apply the filter
|
42 |
+
logging.getLogger().addFilter(SuppressRepetitiveErrors())
|
43 |
+
|
44 |
+
from modules.api_client import ArbiscanClient, GeminiClient
|
45 |
+
from modules.data_processor import DataProcessor
|
46 |
+
from modules.visualizer import Visualizer
|
47 |
+
from modules.detection import ManipulationDetector
|
48 |
+
|
49 |
+
# Load environment variables
|
50 |
+
load_dotenv()
|
51 |
+
|
52 |
+
# Set page configuration
|
53 |
+
st.set_page_config(
|
54 |
+
page_title="Whale Wallet AI - Market Manipulation Detection",
|
55 |
+
page_icon="π³",
|
56 |
+
layout="wide",
|
57 |
+
initial_sidebar_state="expanded"
|
58 |
+
)
|
59 |
+
|
60 |
+
# Add custom CSS
|
61 |
+
st.markdown("""
|
62 |
+
<style>
|
63 |
+
.main-header {
|
64 |
+
font-size: 2.5rem;
|
65 |
+
color: #1E88E5;
|
66 |
+
text-align: center;
|
67 |
+
margin-bottom: 1rem;
|
68 |
+
}
|
69 |
+
.sub-header {
|
70 |
+
font-size: 1.5rem;
|
71 |
+
color: #424242;
|
72 |
+
margin-bottom: 1rem;
|
73 |
+
}
|
74 |
+
.info-text {
|
75 |
+
background-color: #E3F2FD;
|
76 |
+
padding: 1rem;
|
77 |
+
border-radius: 0.5rem;
|
78 |
+
margin-bottom: 1rem;
|
79 |
+
}
|
80 |
+
.stButton>button {
|
81 |
+
width: 100%;
|
82 |
+
}
|
83 |
+
</style>
|
84 |
+
""", unsafe_allow_html=True)
|
85 |
+
|
86 |
+
# Initialize Streamlit session state for persisting data between tab navigation
|
87 |
+
if 'transactions_data' not in st.session_state:
|
88 |
+
st.session_state.transactions_data = pd.DataFrame()
|
89 |
+
|
90 |
+
if 'patterns_data' not in st.session_state:
|
91 |
+
st.session_state.patterns_data = None
|
92 |
+
|
93 |
+
if 'price_impact_data' not in st.session_state:
|
94 |
+
st.session_state.price_impact_data = None
|
95 |
+
|
96 |
+
# Performance metrics tracking
|
97 |
+
if 'performance_metrics' not in st.session_state:
|
98 |
+
st.session_state.performance_metrics = {
|
99 |
+
'api_calls': 0,
|
100 |
+
'data_processing_time': 0,
|
101 |
+
'visualization_time': 0,
|
102 |
+
'last_refresh': None
|
103 |
+
}
|
104 |
+
|
105 |
+
# Function to track performance
|
106 |
+
def track_timing(category: str):
|
107 |
+
def timing_decorator(func):
|
108 |
+
def wrapper(*args, **kwargs):
|
109 |
+
start_time = time.time()
|
110 |
+
result = func(*args, **kwargs)
|
111 |
+
elapsed = time.time() - start_time
|
112 |
+
|
113 |
+
if category in st.session_state.performance_metrics:
|
114 |
+
st.session_state.performance_metrics[category] += elapsed
|
115 |
+
else:
|
116 |
+
st.session_state.performance_metrics[category] = elapsed
|
117 |
+
|
118 |
+
return result
|
119 |
+
return wrapper
|
120 |
+
return timing_decorator
|
121 |
+
|
122 |
+
if 'alerts_data' not in st.session_state:
|
123 |
+
st.session_state.alerts_data = None
|
124 |
+
|
125 |
+
# Initialize API clients
|
126 |
+
arbiscan_client = ArbiscanClient(os.getenv("ARBISCAN_API_KEY"))
|
127 |
+
# Set debug mode to False to reduce log output
|
128 |
+
arbiscan_client.verbose_debug = False
|
129 |
+
gemini_client = GeminiClient(os.getenv("GEMINI_API_KEY"))
|
130 |
+
|
131 |
+
# Initialize data processor and visualizer
|
132 |
+
data_processor = DataProcessor()
|
133 |
+
visualizer = Visualizer()
|
134 |
+
|
135 |
+
# Apply performance tracking to key instance methods after initialization
|
136 |
+
original_fetch_whale = arbiscan_client.fetch_whale_transactions
|
137 |
+
arbiscan_client.fetch_whale_transactions = track_timing('api_calls')(original_fetch_whale)
|
138 |
+
|
139 |
+
original_identify_patterns = data_processor.identify_patterns
|
140 |
+
data_processor.identify_patterns = track_timing('data_processing_time')(original_identify_patterns)
|
141 |
+
|
142 |
+
original_analyze_price_impact = data_processor.analyze_price_impact
|
143 |
+
data_processor.analyze_price_impact = track_timing('data_processing_time')(original_analyze_price_impact)
|
144 |
+
detection = ManipulationDetector()
|
145 |
+
|
146 |
+
# Initialize crew system (for AI-assisted analysis)
|
147 |
+
try:
|
148 |
+
from modules.crew_system import WhaleAnalysisCrewSystem
|
149 |
+
crew_system = WhaleAnalysisCrewSystem(arbiscan_client, gemini_client, data_processor)
|
150 |
+
CREW_ENABLED = True
|
151 |
+
logging.info("CrewAI system loaded successfully")
|
152 |
+
except Exception as e:
|
153 |
+
CREW_ENABLED = False
|
154 |
+
logging.error(f"Failed to load CrewAI system: {str(e)}")
|
155 |
+
st.sidebar.error("CrewAI features are disabled due to an error.")
|
156 |
+
|
157 |
+
# Sidebar for inputs
|
158 |
+
st.sidebar.header("Configuration")
|
159 |
+
|
160 |
+
# Wallet tracking section
|
161 |
+
st.sidebar.subheader("Track Wallets")
|
162 |
+
wallet_addresses = st.sidebar.text_area(
|
163 |
+
"Enter wallet addresses (one per line)",
|
164 |
+
placeholder="0x1234abcd...\n0xabcd1234..."
|
165 |
+
)
|
166 |
+
|
167 |
+
threshold_type = st.sidebar.radio(
|
168 |
+
"Threshold Type",
|
169 |
+
["Token Amount", "USD Value"]
|
170 |
+
)
|
171 |
+
|
172 |
+
if threshold_type == "Token Amount":
|
173 |
+
threshold_value = st.sidebar.number_input("Minimum Token Amount", min_value=0.0, value=1000.0)
|
174 |
+
token_symbol = st.sidebar.text_input("Token Symbol", placeholder="ETH")
|
175 |
+
else:
|
176 |
+
threshold_value = st.sidebar.number_input("Minimum USD Value", min_value=0.0, value=100000.0)
|
177 |
+
|
178 |
+
# Time period selection
|
179 |
+
st.sidebar.subheader("Time Period")
|
180 |
+
time_period = st.sidebar.selectbox(
|
181 |
+
"Select Time Period",
|
182 |
+
["Last 24 hours", "Last 7 days", "Last 30 days", "Custom"]
|
183 |
+
)
|
184 |
+
|
185 |
+
if time_period == "Custom":
|
186 |
+
start_date = st.sidebar.date_input("Start Date", datetime.now() - timedelta(days=7))
|
187 |
+
end_date = st.sidebar.date_input("End Date", datetime.now())
|
188 |
+
else:
|
189 |
+
# Calculate dates based on selection
|
190 |
+
end_date = datetime.now()
|
191 |
+
if time_period == "Last 24 hours":
|
192 |
+
start_date = end_date - timedelta(days=1)
|
193 |
+
elif time_period == "Last 7 days":
|
194 |
+
start_date = end_date - timedelta(days=7)
|
195 |
+
else: # Last 30 days
|
196 |
+
start_date = end_date - timedelta(days=30)
|
197 |
+
|
198 |
+
# Manipulation detection settings
|
199 |
+
st.sidebar.subheader("Manipulation Detection")
|
200 |
+
enable_manipulation_detection = st.sidebar.toggle("Enable Manipulation Detection", value=True)
|
201 |
+
if enable_manipulation_detection:
|
202 |
+
sensitivity = st.sidebar.select_slider(
|
203 |
+
"Detection Sensitivity",
|
204 |
+
options=["Low", "Medium", "High"],
|
205 |
+
value="Medium"
|
206 |
+
)
|
207 |
+
|
208 |
+
# Price impact analysis settings
|
209 |
+
st.sidebar.subheader("Price Impact Analysis")
|
210 |
+
enable_price_impact = st.sidebar.toggle("Enable Price Impact Analysis", value=True)
|
211 |
+
if enable_price_impact:
|
212 |
+
lookback_minutes = st.sidebar.slider("Lookback (minutes)", 1, 60, 5)
|
213 |
+
lookahead_minutes = st.sidebar.slider("Lookahead (minutes)", 1, 60, 5)
|
214 |
+
|
215 |
+
# Action buttons
|
216 |
+
track_button = st.sidebar.button("Track Transactions", type="primary")
|
217 |
+
pattern_button = st.sidebar.button("Analyze Patterns")
|
218 |
+
if enable_manipulation_detection:
|
219 |
+
detect_button = st.sidebar.button("Detect Manipulation")
|
220 |
+
|
221 |
+
# Main content area
|
222 |
+
tab1, tab2, tab3, tab4, tab5 = st.tabs([
|
223 |
+
"Transactions", "Patterns", "Price Impact", "Alerts", "Reports"
|
224 |
+
])
|
225 |
+
|
226 |
+
with tab1:
|
227 |
+
st.header("Whale Transactions")
|
228 |
+
if track_button and wallet_addresses:
|
229 |
+
with st.spinner("Fetching whale transactions..."):
|
230 |
+
# Function to track whale transactions
|
231 |
+
def track_whale_transactions(wallets, start_date, end_date, threshold_value, threshold_type, token_symbol=None):
|
232 |
+
# Direct API call since CrewAI is temporarily disabled
|
233 |
+
try:
|
234 |
+
min_token_amount = None
|
235 |
+
min_usd_value = None
|
236 |
+
if threshold_type == "Token Amount":
|
237 |
+
min_token_amount = threshold_value
|
238 |
+
else:
|
239 |
+
min_usd_value = threshold_value
|
240 |
+
|
241 |
+
# Add pagination control to prevent infinite API requests
|
242 |
+
max_pages = 5 # Limit the number of pages to prevent excessive API calls
|
243 |
+
transactions = arbiscan_client.fetch_whale_transactions(
|
244 |
+
addresses=wallets,
|
245 |
+
min_token_amount=min_token_amount,
|
246 |
+
max_pages=5,
|
247 |
+
min_usd_value=min_usd_value
|
248 |
+
)
|
249 |
+
|
250 |
+
if transactions.empty:
|
251 |
+
st.warning("No transactions found for the specified addresses")
|
252 |
+
|
253 |
+
return transactions
|
254 |
+
except Exception as e:
|
255 |
+
st.error(f"Error fetching transactions: {str(e)}")
|
256 |
+
return pd.DataFrame()
|
257 |
+
|
258 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
259 |
+
|
260 |
+
# Use cached data or fetch new if not available
|
261 |
+
if st.session_state.transactions_data is None or track_button:
|
262 |
+
with st.spinner("Fetching transactions..."):
|
263 |
+
transactions = track_whale_transactions(
|
264 |
+
wallets=wallet_list,
|
265 |
+
start_date=start_date,
|
266 |
+
end_date=end_date,
|
267 |
+
threshold_value=threshold_value,
|
268 |
+
threshold_type=threshold_type,
|
269 |
+
token_symbol=token_symbol
|
270 |
+
)
|
271 |
+
# Store in session state
|
272 |
+
st.session_state.transactions_data = transactions
|
273 |
+
else:
|
274 |
+
transactions = st.session_state.transactions_data
|
275 |
+
|
276 |
+
if not transactions.empty:
|
277 |
+
st.success(f"Found {len(transactions)} transactions matching your criteria")
|
278 |
+
|
279 |
+
# Display transactions
|
280 |
+
if len(transactions) > 0:
|
281 |
+
st.dataframe(transactions, use_container_width=True)
|
282 |
+
|
283 |
+
# Add download button
|
284 |
+
csv = transactions.to_csv(index=False).encode('utf-8')
|
285 |
+
st.download_button(
|
286 |
+
"Download Transactions CSV",
|
287 |
+
csv,
|
288 |
+
"whale_transactions.csv",
|
289 |
+
"text/csv",
|
290 |
+
key='download-csv'
|
291 |
+
)
|
292 |
+
|
293 |
+
# Volume by day chart
|
294 |
+
st.subheader("Transaction Volume by Day")
|
295 |
+
try:
|
296 |
+
st.plotly_chart(visualizer.plot_volume_by_day(transactions), use_container_width=True)
|
297 |
+
except Exception as e:
|
298 |
+
st.error(f"Error generating volume chart: {str(e)}")
|
299 |
+
|
300 |
+
# Transaction flow visualization
|
301 |
+
st.subheader("Transaction Flow")
|
302 |
+
try:
|
303 |
+
flow_chart = visualizer.plot_transaction_flow(transactions)
|
304 |
+
st.plotly_chart(flow_chart, use_container_width=True)
|
305 |
+
except Exception as e:
|
306 |
+
st.error(f"Error generating flow chart: {str(e)}")
|
307 |
+
else:
|
308 |
+
st.warning("No transactions found matching your criteria. Try adjusting the parameters.")
|
309 |
+
else:
|
310 |
+
st.info("Enter wallet addresses and click 'Track Transactions' to view whale activity")
|
311 |
+
|
312 |
+
with tab2:
|
313 |
+
st.header("Trading Patterns")
|
314 |
+
if track_button and wallet_addresses:
|
315 |
+
with st.spinner("Analyzing trading patterns..."):
|
316 |
+
# Function to analyze trading patterns
|
317 |
+
def analyze_trading_patterns(wallets, start_date, end_date):
|
318 |
+
# Direct analysis
|
319 |
+
try:
|
320 |
+
transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
|
321 |
+
if transactions_df.empty:
|
322 |
+
st.warning("No transactions found for the specified addresses")
|
323 |
+
return []
|
324 |
+
|
325 |
+
return data_processor.identify_patterns(transactions_df)
|
326 |
+
except Exception as e:
|
327 |
+
st.error(f"Error analyzing trading patterns: {str(e)}")
|
328 |
+
return []
|
329 |
+
|
330 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
331 |
+
|
332 |
+
# Use cached data or fetch new if not available
|
333 |
+
if st.session_state.patterns_data is None or track_button:
|
334 |
+
with st.spinner("Analyzing trading patterns..."):
|
335 |
+
patterns = analyze_trading_patterns(
|
336 |
+
wallets=wallet_list,
|
337 |
+
start_date=start_date,
|
338 |
+
end_date=end_date
|
339 |
+
)
|
340 |
+
# Store in session state
|
341 |
+
st.session_state.patterns_data = patterns
|
342 |
+
else:
|
343 |
+
patterns = st.session_state.patterns_data
|
344 |
+
|
345 |
+
if patterns:
|
346 |
+
for i, pattern in enumerate(patterns):
|
347 |
+
pattern_card = st.container()
|
348 |
+
with pattern_card:
|
349 |
+
# Pattern header with name and risk profile
|
350 |
+
header_cols = st.columns([3, 1])
|
351 |
+
with header_cols[0]:
|
352 |
+
st.subheader(f"Pattern {i+1}: {pattern['name']}")
|
353 |
+
with header_cols[1]:
|
354 |
+
risk_color = "green"
|
355 |
+
if pattern.get('risk_profile') == "Medium":
|
356 |
+
risk_color = "orange"
|
357 |
+
elif pattern.get('risk_profile') in ["High", "Very High"]:
|
358 |
+
risk_color = "red"
|
359 |
+
st.markdown(f"<h5 style='color:{risk_color};'>Risk: {pattern.get('risk_profile', 'Unknown')}</h5>", unsafe_allow_html=True)
|
360 |
+
|
361 |
+
# Pattern description and details
|
362 |
+
st.markdown(f"**Description:** {pattern['description']}")
|
363 |
+
|
364 |
+
# Additional strategy information
|
365 |
+
if 'strategy' in pattern:
|
366 |
+
st.markdown(f"**Strategy:** {pattern['strategy']}")
|
367 |
+
|
368 |
+
# Time insight
|
369 |
+
if 'time_insight' in pattern:
|
370 |
+
st.info(pattern['time_insight'])
|
371 |
+
|
372 |
+
# Metrics
|
373 |
+
metric_cols = st.columns(3)
|
374 |
+
with metric_cols[0]:
|
375 |
+
st.markdown(f"**Occurrences:** {pattern['occurrence_count']} instances")
|
376 |
+
with metric_cols[1]:
|
377 |
+
st.markdown(f"**Confidence:** {pattern.get('confidence', 0):.2f}")
|
378 |
+
with metric_cols[2]:
|
379 |
+
st.markdown(f"**Volume:** {pattern.get('volume_metric', 'N/A')}")
|
380 |
+
|
381 |
+
# Display main chart first
|
382 |
+
if 'charts' in pattern and 'main' in pattern['charts']:
|
383 |
+
st.plotly_chart(pattern['charts']['main'], use_container_width=True)
|
384 |
+
elif 'chart_data' in pattern and pattern['chart_data'] is not None: # Fallback for old format
|
385 |
+
st.plotly_chart(pattern['chart_data'], use_container_width=True)
|
386 |
+
|
387 |
+
# Create two columns for additional charts
|
388 |
+
if 'charts' in pattern and len(pattern['charts']) > 1:
|
389 |
+
charts_col1, charts_col2 = st.columns(2)
|
390 |
+
|
391 |
+
# Hourly distribution chart
|
392 |
+
if 'hourly_distribution' in pattern['charts']:
|
393 |
+
with charts_col1:
|
394 |
+
st.plotly_chart(pattern['charts']['hourly_distribution'], use_container_width=True)
|
395 |
+
|
396 |
+
# Value distribution chart
|
397 |
+
if 'value_distribution' in pattern['charts']:
|
398 |
+
with charts_col2:
|
399 |
+
st.plotly_chart(pattern['charts']['value_distribution'], use_container_width=True)
|
400 |
+
|
401 |
+
# Advanced metrics in expander
|
402 |
+
if 'metrics' in pattern and pattern['metrics']:
|
403 |
+
with st.expander("Detailed Metrics"):
|
404 |
+
metrics_table = []
|
405 |
+
for k, v in pattern['metrics'].items():
|
406 |
+
if v is not None:
|
407 |
+
if isinstance(v, float):
|
408 |
+
metrics_table.append([k.replace('_', ' ').title(), f"{v:.4f}"])
|
409 |
+
else:
|
410 |
+
metrics_table.append([k.replace('_', ' ').title(), v])
|
411 |
+
|
412 |
+
if metrics_table:
|
413 |
+
st.table(pd.DataFrame(metrics_table, columns=["Metric", "Value"]))
|
414 |
+
|
415 |
+
# Display example transactions
|
416 |
+
if 'examples' in pattern and not pattern['examples'].empty:
|
417 |
+
with st.expander("Example Transactions"):
|
418 |
+
# Format the dataframe for better display
|
419 |
+
display_df = pattern['examples'].copy()
|
420 |
+
# Convert timestamp to readable format if needed
|
421 |
+
if 'timeStamp' in display_df.columns and not pd.api.types.is_datetime64_any_dtype(display_df['timeStamp']):
|
422 |
+
display_df['timeStamp'] = pd.to_datetime(display_df['timeStamp'], unit='s')
|
423 |
+
|
424 |
+
st.dataframe(display_df, use_container_width=True)
|
425 |
+
|
426 |
+
st.markdown("---")
|
427 |
+
else:
|
428 |
+
st.info("No significant trading patterns detected. Try expanding the date range or adding more addresses.")
|
429 |
+
else:
|
430 |
+
st.info("Track transactions to analyze trading patterns")
|
431 |
+
|
432 |
+
with tab3:
|
433 |
+
st.header("Price Impact Analysis")
|
434 |
+
if enable_price_impact and track_button and wallet_addresses:
|
435 |
+
with st.spinner("Analyzing price impact..."):
|
436 |
+
# Function to analyze price impact
|
437 |
+
def analyze_price_impact(wallets, start_date, end_date, lookback_minutes, lookahead_minutes):
|
438 |
+
# Direct analysis
|
439 |
+
transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
|
440 |
+
# Get token from first transaction
|
441 |
+
if not transactions_df.empty:
|
442 |
+
token_symbol = transactions_df.iloc[0].get('tokenSymbol', 'ETH')
|
443 |
+
# For each transaction, get price impact
|
444 |
+
price_impacts = {}
|
445 |
+
progress_bar = st.progress(0)
|
446 |
+
for idx, row in transactions_df.iterrows():
|
447 |
+
progress = int((idx + 1) / len(transactions_df) * 100)
|
448 |
+
progress_bar.progress(progress, text=f"Analyzing transaction {idx+1} of {len(transactions_df)}")
|
449 |
+
if 'timeStamp' in row:
|
450 |
+
try:
|
451 |
+
tx_time = datetime.fromtimestamp(int(row['timeStamp']))
|
452 |
+
impact_data = gemini_client.get_price_impact(
|
453 |
+
symbol=f"{token_symbol}USD",
|
454 |
+
transaction_time=tx_time,
|
455 |
+
lookback_minutes=lookback_minutes,
|
456 |
+
lookahead_minutes=lookahead_minutes
|
457 |
+
)
|
458 |
+
price_impacts[row['hash']] = impact_data
|
459 |
+
except Exception as e:
|
460 |
+
st.warning(f"Could not get price data for transaction: {str(e)}")
|
461 |
+
|
462 |
+
progress_bar.empty()
|
463 |
+
if price_impacts:
|
464 |
+
return data_processor.analyze_price_impact(transactions_df, price_impacts)
|
465 |
+
|
466 |
+
# Create an empty chart for the default case
|
467 |
+
empty_fig = go.Figure()
|
468 |
+
empty_fig.update_layout(
|
469 |
+
title="No Price Impact Data Available",
|
470 |
+
xaxis_title="Time",
|
471 |
+
yaxis_title="Price Impact (%)",
|
472 |
+
height=400,
|
473 |
+
template="plotly_white"
|
474 |
+
)
|
475 |
+
empty_fig.add_annotation(
|
476 |
+
text="No transactions found with price impact data",
|
477 |
+
showarrow=False,
|
478 |
+
font=dict(size=14)
|
479 |
+
)
|
480 |
+
|
481 |
+
return {
|
482 |
+
"avg_impact_pct": 0,
|
483 |
+
"max_impact_pct": 0,
|
484 |
+
"min_impact_pct": 0,
|
485 |
+
"significant_moves_count": 0,
|
486 |
+
"total_transactions": 0,
|
487 |
+
"transactions_with_impact": pd.DataFrame(),
|
488 |
+
"charts": {
|
489 |
+
"main_chart": empty_fig,
|
490 |
+
"impact_distribution": empty_fig,
|
491 |
+
"cumulative_impact": empty_fig,
|
492 |
+
"hourly_impact": empty_fig
|
493 |
+
},
|
494 |
+
"insights": [],
|
495 |
+
"impact_summary": "No price impact data available"
|
496 |
+
}
|
497 |
+
|
498 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
499 |
+
|
500 |
+
# Use cached data or fetch new if not available
|
501 |
+
if st.session_state.price_impact_data is None or track_button:
|
502 |
+
with st.spinner("Analyzing price impact..."):
|
503 |
+
impact_analysis = analyze_price_impact(
|
504 |
+
wallets=wallet_list,
|
505 |
+
start_date=start_date,
|
506 |
+
end_date=end_date,
|
507 |
+
lookback_minutes=lookback_minutes,
|
508 |
+
lookahead_minutes=lookahead_minutes
|
509 |
+
)
|
510 |
+
# Store in session state
|
511 |
+
st.session_state.price_impact_data = impact_analysis
|
512 |
+
else:
|
513 |
+
impact_analysis = st.session_state.price_impact_data
|
514 |
+
|
515 |
+
if impact_analysis:
|
516 |
+
# Display impact summary
|
517 |
+
if 'impact_summary' in impact_analysis:
|
518 |
+
st.info(impact_analysis['impact_summary'])
|
519 |
+
|
520 |
+
# Summary metrics in two rows
|
521 |
+
metrics_row1 = st.columns(4)
|
522 |
+
with metrics_row1[0]:
|
523 |
+
st.metric("Avg. Price Impact (%)", f"{impact_analysis.get('avg_impact_pct', 0):.2f}%")
|
524 |
+
with metrics_row1[1]:
|
525 |
+
st.metric("Max Impact (%)", f"{impact_analysis.get('max_impact_pct', 0):.2f}%")
|
526 |
+
with metrics_row1[2]:
|
527 |
+
st.metric("Min Impact (%)", f"{impact_analysis.get('min_impact_pct', 0):.2f}%")
|
528 |
+
with metrics_row1[3]:
|
529 |
+
st.metric("Std Dev (%)", f"{impact_analysis.get('std_impact_pct', 0):.2f}%")
|
530 |
+
|
531 |
+
metrics_row2 = st.columns(4)
|
532 |
+
with metrics_row2[0]:
|
533 |
+
st.metric("Significant Moves", impact_analysis.get('significant_moves_count', 0))
|
534 |
+
with metrics_row2[1]:
|
535 |
+
st.metric("High Impact Moves", impact_analysis.get('high_impact_moves_count', 0))
|
536 |
+
with metrics_row2[2]:
|
537 |
+
st.metric("Positive/Negative", f"{impact_analysis.get('positive_impacts_count', 0)}/{impact_analysis.get('negative_impacts_count', 0)}")
|
538 |
+
with metrics_row2[3]:
|
539 |
+
st.metric("Total Transactions", impact_analysis.get('total_transactions', 0))
|
540 |
+
|
541 |
+
# Display insights if available
|
542 |
+
if 'insights' in impact_analysis and impact_analysis['insights']:
|
543 |
+
st.subheader("Key Insights")
|
544 |
+
for insight in impact_analysis['insights']:
|
545 |
+
st.markdown(f"**{insight['title']}**: {insight['description']}")
|
546 |
+
|
547 |
+
# Display the main chart
|
548 |
+
if 'charts' in impact_analysis and 'main_chart' in impact_analysis['charts']:
|
549 |
+
st.subheader("Price Impact Over Time")
|
550 |
+
st.plotly_chart(impact_analysis['charts']['main_chart'], use_container_width=True)
|
551 |
+
|
552 |
+
# Create two columns for secondary charts
|
553 |
+
col1, col2 = st.columns(2)
|
554 |
+
|
555 |
+
# Distribution chart
|
556 |
+
if 'charts' in impact_analysis and 'impact_distribution' in impact_analysis['charts']:
|
557 |
+
with col1:
|
558 |
+
st.plotly_chart(impact_analysis['charts']['impact_distribution'], use_container_width=True)
|
559 |
+
|
560 |
+
# Cumulative impact chart
|
561 |
+
if 'charts' in impact_analysis and 'cumulative_impact' in impact_analysis['charts']:
|
562 |
+
with col2:
|
563 |
+
st.plotly_chart(impact_analysis['charts']['cumulative_impact'], use_container_width=True)
|
564 |
+
|
565 |
+
# Hourly impact chart
|
566 |
+
if 'charts' in impact_analysis and 'hourly_impact' in impact_analysis['charts']:
|
567 |
+
st.plotly_chart(impact_analysis['charts']['hourly_impact'], use_container_width=True)
|
568 |
+
|
569 |
+
# Detailed transactions with impact
|
570 |
+
if not impact_analysis['transactions_with_impact'].empty:
|
571 |
+
st.subheader("Transactions with Price Impact")
|
572 |
+
# Convert numeric columns to have 2 decimal places for better display
|
573 |
+
display_df = impact_analysis['transactions_with_impact'].copy()
|
574 |
+
for col in ['impact_pct', 'pre_price', 'post_price', 'cumulative_impact']:
|
575 |
+
if col in display_df.columns:
|
576 |
+
display_df[col] = display_df[col].apply(lambda x: f"{float(x):.2f}%" if pd.notnull(x) else "N/A")
|
577 |
+
|
578 |
+
st.dataframe(display_df, use_container_width=True)
|
579 |
+
else:
|
580 |
+
st.info("No transaction-specific price impact data available")
|
581 |
+
else:
|
582 |
+
st.info("No price impact data available for the given parameters")
|
583 |
+
else:
|
584 |
+
st.info("Enable Price Impact Analysis and track transactions to see price effects")
|
585 |
+
|
586 |
+
with tab4:
|
587 |
+
st.header("Manipulation Alerts")
|
588 |
+
if enable_manipulation_detection and detect_button and wallet_addresses:
|
589 |
+
with st.spinner("Detecting potential manipulation..."):
|
590 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
591 |
+
|
592 |
+
# Function to detect manipulation
|
593 |
+
def detect_manipulation(wallets, start_date, end_date, sensitivity):
|
594 |
+
try:
|
595 |
+
transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
|
596 |
+
if transactions_df.empty:
|
597 |
+
st.warning("No transactions found for the specified addresses")
|
598 |
+
return []
|
599 |
+
|
600 |
+
pump_dump = detection.detect_pump_and_dump(transactions_df, sensitivity)
|
601 |
+
wash_trades = detection.detect_wash_trading(transactions_df, wallets, sensitivity)
|
602 |
+
return pump_dump + wash_trades
|
603 |
+
except Exception as e:
|
604 |
+
st.error(f"Error detecting manipulation: {str(e)}")
|
605 |
+
return []
|
606 |
+
|
607 |
+
alerts = detect_manipulation(
|
608 |
+
wallets=wallet_list,
|
609 |
+
start_date=start_date,
|
610 |
+
end_date=end_date,
|
611 |
+
sensitivity=sensitivity
|
612 |
+
)
|
613 |
+
|
614 |
+
if alerts:
|
615 |
+
for i, alert in enumerate(alerts):
|
616 |
+
alert_color = "red" if alert['risk_level'] == "High" else "orange" if alert['risk_level'] == "Medium" else "blue"
|
617 |
+
|
618 |
+
with st.expander(f" {alert['type']} - Risk: {alert['risk_level']}", expanded=i==0):
|
619 |
+
st.markdown(f"<h4 style='color:{alert_color}'>{alert['title']}</h4>", unsafe_allow_html=True)
|
620 |
+
st.write(f"**Description:** {alert['description']}")
|
621 |
+
st.write(f"**Detection Time:** {alert['detection_time']}")
|
622 |
+
st.write(f"**Involved Addresses:** {', '.join(alert['addresses'])}")
|
623 |
+
|
624 |
+
# Display evidence
|
625 |
+
if 'evidence' in alert and alert['evidence'] is not None and not (isinstance(alert['evidence'], pd.DataFrame) and alert['evidence'].empty):
|
626 |
+
st.subheader("Evidence")
|
627 |
+
try:
|
628 |
+
evidence_df = alert['evidence']
|
629 |
+
if isinstance(evidence_df, str):
|
630 |
+
# Try to convert from JSON string if needed
|
631 |
+
evidence_df = pd.read_json(evidence_df)
|
632 |
+
st.dataframe(evidence_df, use_container_width=True)
|
633 |
+
except Exception as e:
|
634 |
+
st.error(f"Error displaying evidence: {str(e)}")
|
635 |
+
|
636 |
+
# Display chart if available
|
637 |
+
if 'chart' in alert and alert['chart'] is not None:
|
638 |
+
try:
|
639 |
+
st.plotly_chart(alert['chart'], use_container_width=True)
|
640 |
+
except Exception as e:
|
641 |
+
st.error(f"Error displaying chart: {str(e)}")
|
642 |
+
else:
|
643 |
+
st.success("No manipulation tactics detected for the given parameters")
|
644 |
+
else:
|
645 |
+
st.info("Enable Manipulation Detection and click 'Detect Manipulation' to scan for suspicious activity")
|
646 |
+
|
647 |
+
with tab5:
|
648 |
+
st.header("Reports & Visualizations")
|
649 |
+
|
650 |
+
# Report type selection
|
651 |
+
report_type = st.selectbox(
|
652 |
+
"Select Report Type",
|
653 |
+
["Transaction Summary", "Pattern Analysis", "Price Impact", "Manipulation Detection", "Complete Analysis"]
|
654 |
+
)
|
655 |
+
|
656 |
+
# Export format
|
657 |
+
export_format = st.radio(
|
658 |
+
"Export Format",
|
659 |
+
["CSV", "PDF", "PNG"],
|
660 |
+
horizontal=True
|
661 |
+
)
|
662 |
+
|
663 |
+
# Generate report button
|
664 |
+
if st.button("Generate Report"):
|
665 |
+
if wallet_addresses:
|
666 |
+
with st.spinner("Generating report..."):
|
667 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
668 |
+
|
669 |
+
if CREW_ENABLED and crew_system is not None:
|
670 |
+
try:
|
671 |
+
with st.spinner("Generating AI analysis report..."):
|
672 |
+
# Check if crew_system has llm attribute defined
|
673 |
+
if not hasattr(crew_system, 'llm') or crew_system.llm is None:
|
674 |
+
raise ValueError("LLM not initialized in crew system")
|
675 |
+
|
676 |
+
report = crew_system.generate_market_manipulation_report(wallet_addresses=wallet_list)
|
677 |
+
st.markdown(f"## AI Analysis Report")
|
678 |
+
st.markdown(report['content'])
|
679 |
+
|
680 |
+
if 'charts' in report and report['charts']:
|
681 |
+
for i, chart in enumerate(report['charts']):
|
682 |
+
st.plotly_chart(chart, use_container_width=True)
|
683 |
+
except Exception as e:
|
684 |
+
st.error(f"CrewAI report generation failed: {str(e)}")
|
685 |
+
st.warning("Using direct analysis instead")
|
686 |
+
|
687 |
+
# Fallback to direct analysis
|
688 |
+
with st.spinner("Generating basic analysis..."):
|
689 |
+
insights = detection.generate_manipulation_insights(transactions=st.session_state.transactions_data)
|
690 |
+
st.markdown(f"## Potential Manipulation Insights")
|
691 |
+
|
692 |
+
for insight in insights:
|
693 |
+
st.markdown(f"**{insight['title']}**\n{insight['description']}")
|
694 |
+
else:
|
695 |
+
st.error("Failed to generate report: CrewAI is not enabled")
|
696 |
+
else:
|
697 |
+
st.error("Please enter wallet addresses to generate a report")
|
698 |
+
|
699 |
+
# Footer with instructions
|
700 |
+
st.markdown("---")
|
701 |
+
with st.expander("How to Use"):
|
702 |
+
st.markdown("""
|
703 |
+
### Typical Workflow
|
704 |
+
|
705 |
+
1. **Input wallet addresses** in the sidebar - these are the whale wallets you want to track
|
706 |
+
2. **Set the minimum threshold** for transaction size (token amount or USD value)
|
707 |
+
3. **Select time period** for analysis
|
708 |
+
4. **Click 'Track Transactions'** to see large transfers for these wallets
|
709 |
+
5. **Enable additional analysis** like pattern recognition or manipulation detection
|
710 |
+
6. **Export reports** for further analysis or record-keeping
|
711 |
+
|
712 |
+
### API Keys
|
713 |
+
|
714 |
+
This app requires two API keys to function properly:
|
715 |
+
- **ARBISCAN_API_KEY** - For accessing Arbitrum blockchain data
|
716 |
+
- **GEMINI_API_KEY** - For real-time token price data
|
717 |
+
|
718 |
+
These should be stored in a `.env` file in the project root.
|
719 |
+
""")
|
modules/__init__.py
ADDED
@@ -0,0 +1 @@
|
|
|
|
|
1 |
+
|
modules/__pycache__/__init__.cpython-312.pyc
ADDED
Binary file (157 Bytes). View file
|
|
modules/__pycache__/api_client.cpython-312.pyc
ADDED
Binary file (30.1 kB). View file
|
|
modules/__pycache__/crew_system.cpython-312.pyc
ADDED
Binary file (36.2 kB). View file
|
|
modules/__pycache__/crew_tools.cpython-312.pyc
ADDED
Binary file (18.3 kB). View file
|
|
modules/__pycache__/data_processor.cpython-312.pyc
ADDED
Binary file (44.1 kB). View file
|
|
modules/__pycache__/detection.cpython-312.pyc
ADDED
Binary file (22.3 kB). View file
|
|
modules/__pycache__/visualizer.cpython-312.pyc
ADDED
Binary file (23.2 kB). View file
|
|
modules/api_client.py
ADDED
@@ -0,0 +1,768 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
import requests
|
2 |
+
import json
|
3 |
+
import time
|
4 |
+
import logging
|
5 |
+
from datetime import datetime
|
6 |
+
import pandas as pd
|
7 |
+
from typing import Dict, List, Optional, Union, Any
|
8 |
+
|
9 |
+
class ArbiscanClient:
|
10 |
+
"""
|
11 |
+
Client to interact with the Arbiscan API for fetching on-chain data from Arbitrum
|
12 |
+
"""
|
13 |
+
|
14 |
+
def __init__(self, api_key: str):
|
15 |
+
self.api_key = api_key
|
16 |
+
self.base_url = "https://api.arbiscan.io/api"
|
17 |
+
self.rate_limit_delay = 0.2 # Delay between API calls to avoid rate limiting (200ms)
|
18 |
+
|
19 |
+
# Add caching to improve performance
|
20 |
+
self._transaction_cache = {}
|
21 |
+
self._last_api_call_time = 0
|
22 |
+
|
23 |
+
# Configure debug logging - set to True for verbose output, False for minimal output
|
24 |
+
self.verbose_debug = False
|
25 |
+
|
26 |
+
def _make_request(self, params: Dict[str, str]) -> Dict[str, Any]:
|
27 |
+
"""
|
28 |
+
Make a request to the Arbiscan API with rate limiting
|
29 |
+
"""
|
30 |
+
params["apikey"] = self.api_key
|
31 |
+
|
32 |
+
# Implement rate limiting
|
33 |
+
current_time = time.time()
|
34 |
+
time_since_last_call = current_time - self._last_api_call_time
|
35 |
+
if time_since_last_call < self.rate_limit_delay:
|
36 |
+
time.sleep(self.rate_limit_delay - time_since_last_call)
|
37 |
+
self._last_api_call_time = time.time()
|
38 |
+
|
39 |
+
try:
|
40 |
+
# Log the request details but only in verbose mode
|
41 |
+
if self.verbose_debug:
|
42 |
+
debug_params = params.copy()
|
43 |
+
debug_params.pop("apikey", None)
|
44 |
+
logging.debug(f"API Request: {self.base_url}")
|
45 |
+
logging.debug(f"Params: {json.dumps(debug_params, indent=2)}")
|
46 |
+
|
47 |
+
response = requests.get(self.base_url, params=params)
|
48 |
+
|
49 |
+
# Print response status and URL only in verbose mode
|
50 |
+
if self.verbose_debug:
|
51 |
+
logging.debug(f"Response Status: {response.status_code}")
|
52 |
+
logging.debug(f"Full URL: {response.url.replace(self.api_key, 'API_KEY_REDACTED')}")
|
53 |
+
|
54 |
+
response.raise_for_status()
|
55 |
+
|
56 |
+
# Parse the JSON response
|
57 |
+
json_data = response.json()
|
58 |
+
|
59 |
+
# Log the response structure but only in verbose mode
|
60 |
+
if self.verbose_debug:
|
61 |
+
result_preview = str(json_data.get('result', ''))[:100] + '...' if len(str(json_data.get('result', ''))) > 100 else str(json_data.get('result', ''))
|
62 |
+
logging.debug(f"Response Status: {json_data.get('status')}")
|
63 |
+
logging.debug(f"Response Message: {json_data.get('message', 'No message')}")
|
64 |
+
logging.debug(f"Result Preview: {result_preview}")
|
65 |
+
|
66 |
+
# Check for API-level errors in the response
|
67 |
+
status = json_data.get('status')
|
68 |
+
message = json_data.get('message', 'No message')
|
69 |
+
if status == '0' and message != 'No transactions found':
|
70 |
+
logging.warning(f"API Error: {message}")
|
71 |
+
|
72 |
+
return json_data
|
73 |
+
|
74 |
+
except requests.exceptions.HTTPError as e:
|
75 |
+
logging.error(f"HTTP Error in API Request: {e.response.status_code}")
|
76 |
+
raise
|
77 |
+
|
78 |
+
except requests.exceptions.ConnectionError as e:
|
79 |
+
logging.error(f"Connection Error in API Request: {str(e)}")
|
80 |
+
raise
|
81 |
+
|
82 |
+
except requests.exceptions.Timeout as e:
|
83 |
+
logging.error(f"Timeout in API Request: {str(e)}")
|
84 |
+
raise
|
85 |
+
|
86 |
+
except requests.exceptions.RequestException as e:
|
87 |
+
logging.error(f"API Request failed: {str(e)}")
|
88 |
+
print(f"ERROR - URL: {self.base_url}")
|
89 |
+
print(f"ERROR - Method: {params.get('module')}/{params.get('action')}")
|
90 |
+
return {"status": "0", "message": f"Error: {str(e)}", "result": []}
|
91 |
+
|
92 |
+
def get_eth_balance(self, address: str) -> float:
|
93 |
+
"""
|
94 |
+
Get the ETH balance of an address
|
95 |
+
|
96 |
+
Args:
|
97 |
+
address: Wallet address
|
98 |
+
|
99 |
+
Returns:
|
100 |
+
ETH balance as a float
|
101 |
+
"""
|
102 |
+
params = {
|
103 |
+
"module": "account",
|
104 |
+
"action": "balance",
|
105 |
+
"address": address,
|
106 |
+
"tag": "latest"
|
107 |
+
}
|
108 |
+
|
109 |
+
result = self._make_request(params)
|
110 |
+
|
111 |
+
if result.get("status") == "1":
|
112 |
+
# Convert wei to ETH
|
113 |
+
wei_balance = int(result.get("result", "0"))
|
114 |
+
eth_balance = wei_balance / 10**18
|
115 |
+
return eth_balance
|
116 |
+
else:
|
117 |
+
return 0.0
|
118 |
+
|
119 |
+
def get_token_balance(self, address: str, token_address: str) -> float:
|
120 |
+
"""
|
121 |
+
Get the token balance of an address for a specific token
|
122 |
+
|
123 |
+
Args:
|
124 |
+
address: Wallet address
|
125 |
+
token_address: Token contract address
|
126 |
+
|
127 |
+
Returns:
|
128 |
+
Token balance as a float
|
129 |
+
"""
|
130 |
+
params = {
|
131 |
+
"module": "account",
|
132 |
+
"action": "tokenbalance",
|
133 |
+
"address": address,
|
134 |
+
"contractaddress": token_address,
|
135 |
+
"tag": "latest"
|
136 |
+
}
|
137 |
+
|
138 |
+
result = self._make_request(params)
|
139 |
+
|
140 |
+
if result.get("status") == "1":
|
141 |
+
# Get token decimals and convert to proper amount
|
142 |
+
decimals = self.get_token_decimals(token_address)
|
143 |
+
raw_balance = int(result.get("result", "0"))
|
144 |
+
token_balance = raw_balance / 10**decimals
|
145 |
+
return token_balance
|
146 |
+
else:
|
147 |
+
return 0.0
|
148 |
+
|
149 |
+
def get_token_decimals(self, token_address: str) -> int:
|
150 |
+
"""
|
151 |
+
Get the number of decimals for a token
|
152 |
+
|
153 |
+
Args:
|
154 |
+
token_address: Token contract address
|
155 |
+
|
156 |
+
Returns:
|
157 |
+
Number of decimals (default: 18)
|
158 |
+
"""
|
159 |
+
params = {
|
160 |
+
"module": "token",
|
161 |
+
"action": "getToken",
|
162 |
+
"contractaddress": token_address
|
163 |
+
}
|
164 |
+
|
165 |
+
result = self._make_request(params)
|
166 |
+
|
167 |
+
if result.get("status") == "1":
|
168 |
+
token_info = result.get("result", {})
|
169 |
+
return int(token_info.get("divisor", "18"))
|
170 |
+
else:
|
171 |
+
# Default to 18 decimals (most ERC-20 tokens)
|
172 |
+
return 18
|
173 |
+
|
174 |
+
def get_token_transfers(self,
|
175 |
+
address: str,
|
176 |
+
contract_address: Optional[str] = None,
|
177 |
+
start_block: int = 0,
|
178 |
+
end_block: int = 99999999,
|
179 |
+
page: int = 1,
|
180 |
+
offset: int = 100,
|
181 |
+
sort: str = "desc") -> List[Dict[str, Any]]:
|
182 |
+
"""
|
183 |
+
Get token transfers for an address
|
184 |
+
|
185 |
+
Args:
|
186 |
+
address: Wallet address
|
187 |
+
contract_address: Optional token contract address to filter by
|
188 |
+
start_block: Starting block number
|
189 |
+
end_block: Ending block number
|
190 |
+
page: Page number
|
191 |
+
offset: Number of results per page
|
192 |
+
sort: Sort order ("asc" or "desc")
|
193 |
+
|
194 |
+
Returns:
|
195 |
+
List of token transfers
|
196 |
+
"""
|
197 |
+
params = {
|
198 |
+
"module": "account",
|
199 |
+
"action": "tokentx",
|
200 |
+
"address": address,
|
201 |
+
"startblock": str(start_block),
|
202 |
+
"endblock": str(end_block),
|
203 |
+
"page": str(page),
|
204 |
+
"offset": str(offset),
|
205 |
+
"sort": sort
|
206 |
+
}
|
207 |
+
|
208 |
+
# Add contract address if specified
|
209 |
+
if contract_address:
|
210 |
+
params["contractaddress"] = contract_address
|
211 |
+
|
212 |
+
result = self._make_request(params)
|
213 |
+
|
214 |
+
if result.get("status") == "1":
|
215 |
+
return result.get("result", [])
|
216 |
+
else:
|
217 |
+
message = result.get("message", "Unknown error")
|
218 |
+
if "No transactions found" in message:
                return []
            else:
                logging.warning(f"Error fetching token transfers: {message}")
                return []

    def fetch_all_token_transfers(self,
                                  address: str,
                                  contract_address: Optional[str] = None,
                                  start_block: int = 0,
                                  end_block: int = 99999999,
                                  max_pages: int = 10) -> List[Dict[str, Any]]:
        """
        Fetch all token transfers for an address, paginating through results

        Args:
            address: Wallet address
            contract_address: Optional token contract address to filter by
            start_block: Starting block number
            end_block: Ending block number
            max_pages: Maximum number of pages to fetch

        Returns:
            List of all token transfers
        """
        all_transfers = []
        offset = 100  # Results per page (API limit)

        for page in range(1, max_pages + 1):
            try:
                transfers = self.get_token_transfers(
                    address=address,
                    contract_address=contract_address,
                    start_block=start_block,
                    end_block=end_block,
                    page=page,
                    offset=offset
                )

                # No more transfers, break the loop
                if not transfers:
                    break

                all_transfers.extend(transfers)

                # If we got fewer results than the offset, we've reached the end
                if len(transfers) < offset:
                    break

            except Exception as e:
                logging.error(f"Error fetching page {page} of token transfers: {str(e)}")
                break

        return all_transfers

    def fetch_whale_transactions(self,
                                 addresses: List[str],
                                 token_address: Optional[str] = None,
                                 min_token_amount: Optional[float] = None,
                                 min_usd_value: Optional[float] = None,
                                 start_block: int = 0,
                                 end_block: int = 99999999,
                                 max_pages: int = 10) -> pd.DataFrame:
        """
        Fetch whale transactions for a list of addresses

        Args:
            addresses: List of wallet addresses
            token_address: Optional token contract address to filter by
            min_token_amount: Minimum token amount to be considered a whale transaction
            min_usd_value: Minimum USD value to be considered a whale transaction
            start_block: Starting block number
            end_block: Ending block number
            max_pages: Maximum number of pages to fetch per address (default: 10)

        Returns:
            DataFrame of whale transactions
        """
        try:
            # Create a cache key based on parameters
            cache_key = f"{','.join(addresses)}_{token_address}_{min_token_amount}_{min_usd_value}_{start_block}_{end_block}_{max_pages}"

            # Check if we have cached results
            if cache_key in self._transaction_cache:
                logging.info(f"Using cached transactions for {len(addresses)} addresses")
                return self._transaction_cache[cache_key]

            all_transfers = []

            logging.info(f"Fetching whale transactions for {len(addresses)} addresses")
            logging.info(f"Token address filter: {token_address if token_address else 'None'}")
            logging.info(f"Min token amount: {min_token_amount}")
            logging.info(f"Min USD value: {min_usd_value}")

            for i, address in enumerate(addresses):
                try:
                    logging.info(f"Processing address {i+1}/{len(addresses)}: {address}")

                    # Create address-specific cache key
                    addr_cache_key = f"{address}_{token_address}_{start_block}_{end_block}_{max_pages}"

                    # Check if we have cached results for this specific address
                    if addr_cache_key in self._transaction_cache:
                        transfers = self._transaction_cache[addr_cache_key]
                        logging.info(f"Using cached {len(transfers)} transfers for address {address}")
                    else:
                        transfers = self.fetch_all_token_transfers(
                            address=address,
                            contract_address=token_address,
                            start_block=start_block,
                            end_block=end_block,
                            max_pages=max_pages
                        )
                        logging.info(f"Found {len(transfers)} transfers for address {address}")
                        # Cache the results for this address
                        self._transaction_cache[addr_cache_key] = transfers

                    all_transfers.extend(transfers)
                except Exception as e:
                    logging.error(f"Failed to fetch transactions for address {address}: {str(e)}")
                    continue

            logging.info(f"Total transfers found: {len(all_transfers)}")

            if not all_transfers:
                logging.warning("No whale transactions found for the specified addresses")
                return pd.DataFrame()

            # Convert to DataFrame
            logging.info("Converting transfers to DataFrame")
            df = pd.DataFrame(all_transfers)

            # Log the column names
            logging.info(f"DataFrame created with {len(df)} rows and {len(df.columns)} columns")
            logging.info(f"Columns: {', '.join(df.columns[:5])}...")

            # Apply token amount filter if specified
            if min_token_amount is not None:
                logging.info(f"Applying min token amount filter: {min_token_amount}")
                # Convert raw integer value to a human-readable token amount, then filter
                df['tokenAmount'] = df['value'].astype(float) / (10 ** df['tokenDecimal'].astype(int))
                df = df[df['tokenAmount'] >= min_token_amount]
                logging.info(f"After token amount filtering: {len(df)}/{len(all_transfers)} rows remain")

            # Apply USD value filter if specified (this would require price data)
            if min_usd_value is not None and 'tokenAmount' in df.columns:
                logging.info("USD value filtering is not implemented yet")
                # This would require token price data, which we don't have yet
                # df = df[df['usd_value'] >= min_usd_value]

            # Convert timestamp to datetime
            if 'timeStamp' in df.columns:
                logging.info("Converting timestamp to datetime")
                try:
                    df['timeStamp'] = pd.to_datetime(df['timeStamp'].astype(float), unit='s')
                except Exception as e:
                    logging.error(f"Error converting timestamp: {str(e)}")

            logging.info(f"Final DataFrame has {len(df)} rows")

            # Cache the final result
            self._transaction_cache[cache_key] = df

            return df

        except Exception as e:
            logging.error(f"Error fetching whale transactions: {str(e)}")
            return pd.DataFrame()

    def get_internal_transactions(self,
                                  address: str,
                                  start_block: int = 0,
                                  end_block: int = 99999999,
                                  page: int = 1,
                                  offset: int = 100,
                                  sort: str = "desc") -> List[Dict[str, Any]]:
        """
        Get internal transactions for an address

        Args:
            address: Wallet address
            start_block: Starting block number
            end_block: Ending block number
            page: Page number
            offset: Number of results per page
            sort: Sort order ("asc" or "desc")

        Returns:
            List of internal transactions
        """
        params = {
            "module": "account",
            "action": "txlistinternal",
            "address": address,
            "startblock": str(start_block),
            "endblock": str(end_block),
            "page": str(page),
            "offset": str(offset),
            "sort": sort
        }

        result = self._make_request(params)

        if result.get("status") == "1":
            return result.get("result", [])
        else:
            message = result.get("message", "Unknown error")
            if "No transactions found" in message:
                return []
            else:
                logging.warning(f"Error fetching internal transactions: {message}")
                return []


class GeminiClient:
    """
    Client to interact with the Gemini API for fetching token prices
    """

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.gemini.com/v1"
        # Add caching to avoid repetitive API calls
        self._price_cache = {}
        # Track API errors to avoid flooding logs
        self._error_count = {}
        self._last_api_call = 0  # For rate limiting

    def get_current_price(self, symbol: str) -> Optional[float]:
        """
        Get the current price of a token

        Args:
            symbol: Token symbol (e.g., "ETHUSD")

        Returns:
            Current price as a float or None if not found
        """
        try:
            url = f"{self.base_url}/pubticker/{symbol}"
            response = requests.get(url)
            response.raise_for_status()
            data = response.json()
            return float(data.get("last", 0))
        except requests.exceptions.RequestException as e:
            logging.error(f"Error fetching price from Gemini API: {e}")
            return None

    def get_historical_prices(self,
                              symbol: str,
                              start_time: datetime,
                              end_time: datetime) -> Optional[pd.DataFrame]:
        """
        Get historical prices for a token within a time range

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            start_time: Start datetime
            end_time: End datetime

        Returns:
            DataFrame of historical prices with timestamps
        """
        # Implement simple rate limiting
        current_time = time.time()
        if current_time - self._last_api_call < 0.05:  # 50ms minimum between calls
            time.sleep(0.05)
        self._last_api_call = current_time

        # Create a cache key based on the parameters
        cache_key = f"{symbol}_{int(start_time.timestamp())}_{int(end_time.timestamp())}"

        # Check if we already have this data cached
        if cache_key in self._price_cache:
            return self._price_cache[cache_key]

        # Track errors per symbol; defined before the try block so the except
        # handlers below can always reference it
        error_key = f"error_{symbol}"

        try:
            # Convert datetime to milliseconds
            start_ms = int(start_time.timestamp() * 1000)
            end_ms = int(end_time.timestamp() * 1000)

            url = f"{self.base_url}/trades/{symbol}"
            params = {
                "limit_trades": 500,
                "timestamp": start_ms
            }

            # Check if we've seen too many errors for this symbol
            if self._error_count.get(error_key, 0) > 10:
                # If we've already had too many errors for this symbol, don't try again
                return None

            response = requests.get(url, params=params)
            response.raise_for_status()
            trades = response.json()

            # Reset error count on success
            self._error_count[error_key] = 0

            # Filter trades within the time range
            filtered_trades = [
                trade for trade in trades
                if start_ms <= trade.get("timestampms", 0) <= end_ms
            ]

            if not filtered_trades:
                # Cache negative result to avoid future lookups
                self._price_cache[cache_key] = None
                return None

            # Convert to DataFrame
            df = pd.DataFrame(filtered_trades)

            # Convert timestamp to datetime
            df['timestamp'] = pd.to_datetime(df['timestampms'], unit='ms')

            # Select and rename columns
            result_df = df[['timestamp', 'price', 'amount']].copy()
            result_df.columns = ['Timestamp', 'Price', 'Amount']

            # Convert price to float
            result_df['Price'] = result_df['Price'].astype(float)

            # Cache the result
            self._price_cache[cache_key] = result_df
            return result_df

        except requests.exceptions.HTTPError as e:
            # Handle HTTP errors more efficiently
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            # Only log the first few occurrences of each error
            if self._error_count[error_key] <= 3:
                logging.warning(f"HTTP error fetching price for {symbol}: {e.response.status_code}")
            return None

        except Exception as e:
            # For other errors, use a similar approach
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            if self._error_count[error_key] <= 3:
                logging.error(f"Error fetching prices for {symbol}: {str(e)}")
            return None

    def get_price_at_time(self,
                          symbol: str,
                          timestamp: datetime) -> Optional[float]:
        """
        Get the approximate price of a token at a specific time

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            timestamp: Target datetime

        Returns:
            Price at the specified time as a float or None if not found
        """
        # Look for prices 5 minutes before and after the target time
        start_time = timestamp - pd.Timedelta(minutes=5)
        end_time = timestamp + pd.Timedelta(minutes=5)

        prices_df = self.get_historical_prices(symbol, start_time, end_time)

        if prices_df is None or prices_df.empty:
            return None

        # Find the closest price
        prices_df['time_diff'] = abs(prices_df['Timestamp'] - timestamp)
        closest_price = prices_df.loc[prices_df['time_diff'].idxmin(), 'Price']

        return closest_price

    def get_price_impact(self,
                         symbol: str,
                         transaction_time: datetime,
                         lookback_minutes: int = 5,
                         lookahead_minutes: int = 5) -> Dict[str, Any]:
        """
        Analyze the price impact before and after a transaction

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            transaction_time: Transaction datetime
            lookback_minutes: Minutes to look back before the transaction
            lookahead_minutes: Minutes to look ahead after the transaction

        Returns:
            Dictionary with price impact metrics
        """
        start_time = transaction_time - pd.Timedelta(minutes=lookback_minutes)
        end_time = transaction_time + pd.Timedelta(minutes=lookahead_minutes)

        prices_df = self.get_historical_prices(symbol, start_time, end_time)

        if prices_df is None or prices_df.empty:
            return {
                "pre_price": None,
                "post_price": None,
                "impact_pct": None,
                "prices_df": None
            }

        # Find pre and post transaction prices
        pre_prices = prices_df[prices_df['Timestamp'] < transaction_time]
        post_prices = prices_df[prices_df['Timestamp'] >= transaction_time]

        pre_price = pre_prices['Price'].iloc[-1] if not pre_prices.empty else None
        post_price = post_prices['Price'].iloc[0] if not post_prices.empty else None

        # Calculate impact percentage
        impact_pct = None
        if pre_price is not None and post_price is not None:
            impact_pct = ((post_price - pre_price) / pre_price) * 100

        return {
            "pre_price": pre_price,
            "post_price": post_price,
            "impact_pct": impact_pct,
            "prices_df": prices_df
        }

    def fetch_historical_prices(self, token_symbol: str, timestamp) -> Dict[str, Any]:
        """Fetch historical price data for a token at a specific timestamp

        Args:
            token_symbol: Token symbol (e.g., "ETH")
            timestamp: Timestamp (can be int, float, datetime, or pandas Timestamp)

        Returns:
            Dictionary with price data
        """
        # Convert timestamp to integer if it's not already
        timestamp_value = 0
        try:
            # Handle different timestamp types
            if isinstance(timestamp, (int, float)):
                timestamp_value = int(timestamp)
            elif isinstance(timestamp, pd.Timestamp):
                timestamp_value = int(timestamp.timestamp())
            elif isinstance(timestamp, datetime):
                timestamp_value = int(timestamp.timestamp())
            elif isinstance(timestamp, str):
                # Try to parse string as timestamp
                dt = pd.to_datetime(timestamp)
                timestamp_value = int(dt.timestamp())
            else:
                # Default to current time if invalid type
                logging.warning(f"Invalid timestamp type: {type(timestamp)}, using current time")
                timestamp_value = int(time.time())
        except Exception as e:
            logging.warning(f"Error converting timestamp {timestamp}: {str(e)}, using current time")
            timestamp_value = int(time.time())

        # Check cache first
        cache_key = f"{token_symbol}_{timestamp_value}"
        if cache_key in self._price_cache:
            return self._price_cache[cache_key]

        # Implement rate limiting
        current_time = time.time()
        if current_time - self._last_api_call < 0.05:  # 50ms minimum between calls
            time.sleep(0.05)
        self._last_api_call = current_time

        # Check error count for this symbol
        error_key = f"error_{token_symbol}"
        if self._error_count.get(error_key, 0) > 10:
            # Too many errors, return cached failure
            return {
                'symbol': token_symbol,
                'timestamp': timestamp_value,
                'price': None,
                'status': 'error',
                'error': 'Too many previous errors'
            }

        try:
            url = f"{self.base_url}/trades/{token_symbol}USD"
            params = {
                'limit_trades': 500,
                'timestamp': timestamp_value * 1000  # Convert to milliseconds
            }

            response = requests.get(url, params=params)
            response.raise_for_status()
            data = response.json()

            # Reset error count on success
            self._error_count[error_key] = 0

            # Calculate average price from recent trades
            if data:
                prices = [float(trade['price']) for trade in data]
                avg_price = sum(prices) / len(prices)
                result = {
                    'symbol': token_symbol,
                    'timestamp': timestamp_value,
                    'price': avg_price,
                    'status': 'success'
                }
                # Cache success
                self._price_cache[cache_key] = result
                return result
            else:
                result = {
                    'symbol': token_symbol,
                    'timestamp': timestamp_value,
                    'price': None,
                    'status': 'no_data'
                }
                # Cache no data
                self._price_cache[cache_key] = result
                return result

        except requests.exceptions.HTTPError as e:
            # Handle HTTP errors efficiently
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            # Only log first few occurrences
            if self._error_count[error_key] <= 3:
                logging.warning(f"HTTP error fetching price for {token_symbol}: {e.response.status_code}")
            elif self._error_count[error_key] == 10:
                logging.warning(f"Suppressing further logs for {token_symbol} errors")

            result = {
                'symbol': token_symbol,
                'timestamp': timestamp_value,  # use the normalized timestamp for consistency
                'price': None,
                'status': 'error',
                'error': f"HTTP {e.response.status_code}"
            }
            self._price_cache[cache_key] = result
            return result

        except Exception as e:
            # For other errors
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            if self._error_count[error_key] <= 3:
                logging.error(f"Error fetching prices for {token_symbol}: {str(e)}")

            result = {
                'symbol': token_symbol,
                'timestamp': timestamp_value,
                'price': None,
                'status': 'error',
                'error': str(e)
            }
            self._price_cache[cache_key] = result
            return result
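
For quick reference, a minimal usage sketch of the two clients above. This is not part of the committed code: it assumes the `.env` keys are already loaded into the environment (e.g. via `python-dotenv`), and the wallet address is a placeholder.

```python
# Hypothetical smoke test for ArbiscanClient and GeminiClient (illustrative only)
import os
from datetime import datetime
from modules.api_client import ArbiscanClient, GeminiClient

arbiscan = ArbiscanClient(api_key=os.getenv("ARBISCAN_API_KEY"))
gemini = GeminiClient(api_key=os.getenv("GEMINI_API_KEY"))

# Placeholder whale wallet; replace with a real Arbitrum address
whales = ["0x0000000000000000000000000000000000000000"]

# Pull transfers of at least 1,000 tokens for the tracked wallets
df = arbiscan.fetch_whale_transactions(addresses=whales, min_token_amount=1_000)
print(f"{len(df)} whale transfers found")

# Measure ETH price movement around the current time (±5 minutes)
impact = gemini.get_price_impact(symbol="ETHUSD", transaction_time=datetime.utcnow())
print(impact["impact_pct"])
```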
modules/crew_system.py
ADDED
@@ -0,0 +1,1117 @@
import os
import logging
from typing import Dict, List, Optional, Union, Any, Tuple
import pandas as pd
from datetime import datetime, timedelta
import io
import base64

from crewai import Agent, Task, Crew, Process
from langchain.tools import BaseTool
from langchain.chat_models import ChatOpenAI

from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from modules.crew_tools import (
    ArbiscanGetTokenTransfersTool,
    ArbiscanGetNormalTransactionsTool,
    ArbiscanGetInternalTransactionsTool,
    ArbiscanFetchWhaleTransactionsTool,
    GeminiGetCurrentPriceTool,
    GeminiGetHistoricalPricesTool,
    DataProcessorIdentifyPatternsTool,
    DataProcessorDetectAnomalousTransactionsTool,
    set_global_clients
)


class WhaleAnalysisCrewSystem:
    """
    CrewAI system for analyzing whale wallet activity and detecting market manipulation
    """

    def __init__(self, arbiscan_client: ArbiscanClient, gemini_client: GeminiClient, data_processor: DataProcessor):
        self.arbiscan_client = arbiscan_client
        self.gemini_client = gemini_client
        self.data_processor = data_processor

        # Initialize LLM
        try:
            from langchain.chat_models import ChatOpenAI
            self.llm = ChatOpenAI(
                model="gpt-4",
                temperature=0.2,
                api_key=os.getenv("OPENAI_API_KEY")
            )
        except Exception as e:
            logging.warning(f"Could not initialize LLM: {str(e)}")
            self.llm = None

        # Use a factory method to safely create tool instances
        self.setup_tools()

    def setup_tools(self):
        """Setup LangChain tools for the whale analysis crew"""
        try:
            # Setup clients
            arbiscan_client = ArbiscanClient(api_key=os.getenv("ARBISCAN_API_KEY"))
            gemini_client = GeminiClient(api_key=os.getenv("GEMINI_API_KEY"))
            data_processor = DataProcessor()

            # Set global clients first
            set_global_clients(
                arbiscan_client=arbiscan_client,
                gemini_client=gemini_client,
                data_processor=data_processor
            )

            # Create tools (no need to pass clients, they'll use globals)
            self.arbiscan_tools = [
                self._create_tool(ArbiscanGetTokenTransfersTool),
                self._create_tool(ArbiscanGetNormalTransactionsTool),
                self._create_tool(ArbiscanGetInternalTransactionsTool),
                self._create_tool(ArbiscanFetchWhaleTransactionsTool)
            ]

            self.gemini_tools = [
                self._create_tool(GeminiGetCurrentPriceTool),
                self._create_tool(GeminiGetHistoricalPricesTool)
            ]

            self.data_processor_tools = [
                self._create_tool(DataProcessorIdentifyPatternsTool),
                self._create_tool(DataProcessorDetectAnomalousTransactionsTool)
            ]

            logging.info(f"Successfully created {len(self.arbiscan_tools + self.gemini_tools + self.data_processor_tools)} tools")

        except Exception as e:
            logging.error(f"Error setting up tools: {str(e)}")
            raise Exception(f"Error setting up tools: {str(e)}")

    def _create_tool(self, tool_class, *args, **kwargs):
        """Factory method to safely create a tool with proper error handling"""
        try:
            tool = tool_class(*args, **kwargs)
            return tool
        except Exception as e:
            logging.error(f"Failed to create tool {tool_class.__name__}: {str(e)}")
            raise Exception(f"Failed to create tool {tool_class.__name__}: {str(e)}")

    def create_agents(self):
        """Create the agents for the crew"""

        # Data Collection Agent
        data_collector = Agent(
            role="Blockchain Data Collector",
            goal="Collect comprehensive whale transaction data from the blockchain",
            backstory="""You are a blockchain analytics expert specialized in extracting and
            organizing on-chain data from the Arbitrum network. You have deep knowledge of blockchain
            transaction structures and can efficiently query APIs to gather relevant whale activity.""",
            verbose=True,
            allow_delegation=True,
            tools=self.arbiscan_tools,
            llm=self.llm
        )

        # Price Analysis Agent
        price_analyst = Agent(
            role="Price Impact Analyst",
            goal="Analyze how whale transactions impact token prices",
            backstory="""You are a quantitative market analyst with expertise in correlating
            trading activity with price movements. You specialize in detecting how large trades
            influence market dynamics, and can identify unusual price patterns.""",
            verbose=True,
            allow_delegation=True,
            tools=self.gemini_tools,
            llm=self.llm
        )

        # Pattern Detection Agent
        pattern_detector = Agent(
            role="Trading Pattern Detector",
            goal="Identify recurring behavior patterns in whale trading activity",
            backstory="""You are a data scientist specialized in time-series analysis and behavioral
            pattern recognition. You excel at spotting cyclical behaviors, correlation patterns, and
            anomalous trading activities across multiple addresses.""",
            verbose=True,
            allow_delegation=True,
            tools=self.data_processor_tools,
            llm=self.llm
        )

        # Manipulation Detector Agent
        manipulation_detector = Agent(
            role="Market Manipulation Investigator",
            goal="Detect potential market manipulation in whale activity",
            backstory="""You are a financial forensics expert who has studied market manipulation
            techniques for years. You can identify pump-and-dump schemes, wash trading, spoofing,
            and other deceptive practices used by whale traders to manipulate market prices.""",
            verbose=True,
            allow_delegation=True,
            tools=self.data_processor_tools,
            llm=self.llm
        )

        # Report Generator Agent
        report_generator = Agent(
            role="Insights Reporter",
            goal="Create comprehensive, actionable reports on whale activity",
            backstory="""You are a financial data storyteller who excels at transforming complex
            blockchain data into clear, insightful narratives. You can distill technical findings
            into actionable intelligence for different audiences.""",
            verbose=True,
            allow_delegation=True,
            tools=[],
            llm=self.llm
        )

        return {
            "data_collector": data_collector,
            "price_analyst": price_analyst,
            "pattern_detector": pattern_detector,
            "manipulation_detector": manipulation_detector,
            "report_generator": report_generator
        }

    def track_large_transactions(self,
                                 wallets: List[str],
                                 start_date: datetime,
                                 end_date: datetime,
                                 threshold_value: float,
                                 threshold_type: str,
                                 token_symbol: Optional[str] = None) -> pd.DataFrame:
        """
        Track large buy/sell transactions for specified wallets

        Args:
            wallets: List of wallet addresses to track
            start_date: Start date for analysis
            end_date: End date for analysis
            threshold_value: Minimum value for transaction tracking
            threshold_type: Type of threshold ("Token Amount" or "USD Value")
            token_symbol: Symbol of token to track (only required if threshold_type is "Token Amount")

        Returns:
            DataFrame of large transactions
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Filter for transactions {'of ' + token_symbol if token_symbol else ''} with a
            {'token amount greater than ' + str(threshold_value) if threshold_type == 'Token Amount'
            else 'USD value greater than $' + str(threshold_value)}.

            Return the data in a well-structured format with timestamp, transaction hash,
            sender, recipient, token symbol, and amount.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all large transactions for the specified wallets,
            properly filtered according to the threshold criteria.
            """
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"]],
            tasks=[data_collection_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                transactions_data = json.loads(json_str)

                if isinstance(transactions_data, list):
                    return pd.DataFrame(transactions_data)
                else:
                    return pd.DataFrame()
            else:
                # Try to parse the entire result as JSON
                transactions_data = json.loads(result)

                if isinstance(transactions_data, list):
                    return pd.DataFrame(transactions_data)
                else:
                    return pd.DataFrame()
        except:
            # Fallback to querying the API directly
            token_address = None  # Would need a mapping of symbol to address

            transactions_df = self.arbiscan_client.fetch_whale_transactions(
                addresses=wallets,
                token_address=token_address,
                min_token_amount=threshold_value if threshold_type == "Token Amount" else None,
                min_usd_value=threshold_value if threshold_type == "USD Value" else None
            )

            return transactions_df

    def identify_trading_patterns(self,
                                  wallets: List[str],
                                  start_date: datetime,
                                  end_date: datetime) -> List[Dict[str, Any]]:
        """
        Identify trading patterns for specified wallets

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis

        Returns:
            List of identified patterns
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Include all token transfers, regardless of size.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        pattern_analysis_task = Task(
            description="""
            Analyze the transaction data to identify recurring trading patterns.
            Look for:
            1. Cyclical buying/selling behaviors
            2. Time-of-day patterns
            3. Accumulation/distribution phases
            4. Coordinated movements across multiple addresses

            Cluster similar behaviors and describe each pattern identified.
            """,
            agent=agents["pattern_detector"],
            expected_output="""
            A detailed analysis of trading patterns with:
            - Pattern name/type
            - Description of behavior
            - Frequency and confidence level
            - Example transactions showing the pattern
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["pattern_detector"]],
            tasks=[data_collection_task, pattern_analysis_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                patterns_data = json.loads(json_str)

                # Convert the patterns to the expected format
                return self._convert_patterns_to_visual_format(patterns_data)
            else:
                # Fallback to a simple pattern analysis
                # First, get transaction data directly
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return []

                transactions_df = pd.DataFrame(all_transactions)

                # Use data processor to identify patterns
                patterns = self.data_processor.identify_patterns(transactions_df)

                return patterns
        except Exception as e:
            print(f"Error processing patterns: {str(e)}")
            return []

    def _convert_patterns_to_visual_format(self, patterns_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Convert pattern data from agents to visual format with charts

        Args:
            patterns_data: Pattern data from agents

        Returns:
            List of patterns with visualizations
        """
        visual_patterns = []

        for pattern in patterns_data:
            # Create chart
            if 'examples' in pattern and pattern['examples']:
                examples_data = []

                # Check if examples is a JSON string
                if isinstance(pattern['examples'], str):
                    try:
                        examples_data = pd.read_json(pattern['examples'])
                    except:
                        examples_data = pd.DataFrame()
                else:
                    examples_data = pd.DataFrame(pattern['examples'])

                # Create visualization
                if not examples_data.empty:
                    import plotly.express as px

                    # Check for timestamp column
                    if 'Timestamp' in examples_data.columns:
                        time_col = 'Timestamp'
                    elif 'timeStamp' in examples_data.columns:
                        time_col = 'timeStamp'
                    else:
                        time_col = None

                    # Check for amount column
                    if 'Amount' in examples_data.columns:
                        amount_col = 'Amount'
                    elif 'tokenAmount' in examples_data.columns:
                        amount_col = 'tokenAmount'
                    elif 'value' in examples_data.columns:
                        amount_col = 'value'
                    else:
                        amount_col = None

                    if time_col and amount_col:
                        # Create time series chart
                        fig = px.line(
                            examples_data,
                            x=time_col,
                            y=amount_col,
                            title=f"Pattern: {pattern['name']}"
                        )
                    else:
                        fig = None
                else:
                    fig = None
            else:
                fig = None
                examples_data = pd.DataFrame()

            # Create visual pattern object
            visual_pattern = {
                "name": pattern.get("name", "Unknown Pattern"),
                "description": pattern.get("description", ""),
                "confidence": pattern.get("confidence", 0.5),
                "occurrence_count": pattern.get("occurrence_count", 0),
                "chart_data": fig,
                "examples": examples_data
            }

            visual_patterns.append(visual_pattern)

        return visual_patterns

    def analyze_price_impact(self,
                             wallets: List[str],
                             start_date: datetime,
                             end_date: datetime,
                             lookback_minutes: int = 5,
                             lookahead_minutes: int = 5) -> Dict[str, Any]:
        """
        Analyze the impact of whale transactions on token prices

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis
            lookback_minutes: Minutes to look back before transactions
            lookahead_minutes: Minutes to look ahead after transactions

        Returns:
            Dictionary with price impact analysis
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Focus on large transactions that might impact price.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all significant transactions for the specified wallets.
            """
        )

        price_impact_task = Task(
            description=f"""
            Analyze the price impact of the whale transactions.
            For each transaction:
            1. Fetch price data for {lookback_minutes} minutes before and {lookahead_minutes} minutes after the transaction
            2. Calculate the percentage price change
            3. Identify transactions that caused significant price moves

            Summarize the overall price impact statistics and highlight notable instances.
            """,
            agent=agents["price_analyst"],
            expected_output="""
            A detailed analysis of price impacts with:
            - Average price impact percentage
            - Maximum price impact (positive and negative)
            - Count of significant price moves
            - List of transactions with their corresponding price impacts
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["price_analyst"]],
            tasks=[data_collection_task, price_impact_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                impact_data = json.loads(json_str)

                # Convert the impact data to visual format
                return self._convert_impact_to_visual_format(impact_data)
            else:
                # Fallback to direct calculation
                # First, get transaction data
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return {}

                transactions_df = pd.DataFrame(all_transactions)

                # Calculate price impact for each transaction
                price_data = {}

                for idx, row in transactions_df.iterrows():
                    tx_hash = row.get('hash', '')

                    if not tx_hash:
                        continue

                    # Get symbol
                    symbol = row.get('tokenSymbol', '')
                    if not symbol:
                        continue

                    # Get timestamp
                    timestamp = row.get('timeStamp', 0)
                    if not timestamp:
                        continue

                    # Convert timestamp to datetime
                    if isinstance(timestamp, (int, float)):
                        tx_time = datetime.fromtimestamp(int(timestamp))
                    else:
                        tx_time = timestamp

                    # Get price impact
                    symbol_usd = f"{symbol}USD"
                    impact = self.gemini_client.get_price_impact(
                        symbol=symbol_usd,
                        transaction_time=tx_time,
                        lookback_minutes=lookback_minutes,
                        lookahead_minutes=lookahead_minutes
                    )

                    price_data[tx_hash] = impact

                # Use data processor to analyze price impact
                impact_analysis = self.data_processor.analyze_price_impact(
                    transactions_df=transactions_df,
                    price_data=price_data
                )

                return impact_analysis
        except Exception as e:
            print(f"Error processing price impact: {str(e)}")
            return {}

    def _convert_impact_to_visual_format(self, impact_data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Convert price impact data to visual format with charts

        Args:
            impact_data: Price impact data

        Returns:
            Dictionary with price impact analysis and visualizations
        """
        # Convert transactions_with_impact to DataFrame if it's a string
        if 'transactions_with_impact' in impact_data and isinstance(impact_data['transactions_with_impact'], str):
            try:
                transactions_df = pd.read_json(impact_data['transactions_with_impact'])
            except:
                transactions_df = pd.DataFrame()
        elif 'transactions_with_impact' in impact_data and isinstance(impact_data['transactions_with_impact'], list):
            transactions_df = pd.DataFrame(impact_data['transactions_with_impact'])
        else:
            transactions_df = pd.DataFrame()

        # Create impact chart
        if not transactions_df.empty and 'impact_pct' in transactions_df.columns and 'Timestamp' in transactions_df.columns:
            import plotly.graph_objects as go

            fig = go.Figure()

            fig.add_trace(go.Scatter(
                x=transactions_df['Timestamp'],
                y=transactions_df['impact_pct'],
                mode='markers+lines',
                name='Price Impact (%)',
                marker=dict(
                    size=10,
                    color=transactions_df['impact_pct'],
                    colorscale='RdBu',
                    cmin=-max(abs(transactions_df['impact_pct'])) if len(transactions_df) > 0 else -1,
                    cmax=max(abs(transactions_df['impact_pct'])) if len(transactions_df) > 0 else 1,
                    colorbar=dict(title='Impact %'),
                    symbol='circle'
                )
            ))

            fig.update_layout(
                title='Price Impact of Whale Transactions',
                xaxis_title='Timestamp',
                yaxis_title='Price Impact (%)',
                hovermode='closest'
            )

            # Add zero line
            fig.add_hline(y=0, line_dash="dash", line_color="gray")
        else:
            fig = None

        # Create visual impact analysis
        visual_impact = {
            'avg_impact_pct': impact_data.get('avg_impact_pct', 0),
            'max_impact_pct': impact_data.get('max_impact_pct', 0),
            'min_impact_pct': impact_data.get('min_impact_pct', 0),
            'significant_moves_count': impact_data.get('significant_moves_count', 0),
            'total_transactions': impact_data.get('total_transactions', 0),
            'impact_chart': fig,
            'transactions_with_impact': transactions_df
        }

        return visual_impact

    def detect_manipulation(self,
                            wallets: List[str],
                            start_date: datetime,
                            end_date: datetime,
                            sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential market manipulation by whale wallets

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of manipulation alerts
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Include all token transfers and also fetch price data if available.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        price_impact_task = Task(
            description="""
            Analyze the price impact of the whale transactions.
            For each significant transaction, fetch and analyze price data around the transaction time.
            """,
            agent=agents["price_analyst"],
            expected_output="""
            Price impact data for the transactions.
            """,
            context=[data_collection_task]
        )

        manipulation_detection_task = Task(
            description=f"""
            Detect potential market manipulation patterns in the transaction data with sensitivity level: {sensitivity}.
            Look for:
            1. Pump-and-Dump: Rapid buys followed by coordinated sell-offs
            2. Wash Trading: Self-trading across multiple addresses
            3. Spoofing: Large orders placed then canceled (if detectable)
            4. Momentum Ignition: Creating sharp price moves to trigger other participants' momentum-based trading

            For each potential manipulation, provide:
            - Type of manipulation
            - Involved addresses
            - Risk level (High, Medium, Low)
            - Description of the suspicious behavior
            - Evidence (transactions showing the pattern)
            """,
            agent=agents["manipulation_detector"],
            expected_output="""
            A detailed list of potential manipulation incidents with supporting evidence.
            """,
            context=[data_collection_task, price_impact_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[
                agents["data_collector"],
                agents["price_analyst"],
                agents["manipulation_detector"]
            ],
            tasks=[
                data_collection_task,
                price_impact_task,
                manipulation_detection_task
            ],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                alerts_data = json.loads(json_str)

                # Convert the alerts to visual format
                return self._convert_alerts_to_visual_format(alerts_data)
            else:
                # Fallback to direct detection
                # First, get transaction data
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return []

                transactions_df = pd.DataFrame(all_transactions)

                # Calculate price impact for each transaction
                price_data = {}

                for idx, row in transactions_df.iterrows():
                    tx_hash = row.get('hash', '')

                    if not tx_hash:
                        continue

                    # Get symbol
                    symbol = row.get('tokenSymbol', '')
                    if not symbol:
                        continue

                    # Get timestamp
                    timestamp = row.get('timeStamp', 0)
                    if not timestamp:
                        continue

                    # Convert timestamp to datetime
                    if isinstance(timestamp, (int, float)):
                        tx_time = datetime.fromtimestamp(int(timestamp))
                    else:
                        tx_time = timestamp

                    # Get price impact
                    symbol_usd = f"{symbol}USD"
                    impact = self.gemini_client.get_price_impact(
                        symbol=symbol_usd,
                        transaction_time=tx_time,
                        lookback_minutes=5,
                        lookahead_minutes=5
                    )

                    price_data[tx_hash] = impact

                # Detect wash trading
                wash_trading_alerts = self.data_processor.detect_wash_trading(
                    transactions_df=transactions_df,
                    addresses=wallets,
                    sensitivity=sensitivity
                )

                # Detect pump and dump
                pump_and_dump_alerts = self.data_processor.detect_pump_and_dump(
                    transactions_df=transactions_df,
                    price_data=price_data,
                    sensitivity=sensitivity
                )

                # Combine alerts
                all_alerts = wash_trading_alerts + pump_and_dump_alerts

                return all_alerts
        except Exception as e:
            print(f"Error detecting manipulation: {str(e)}")
            return []

    def _convert_alerts_to_visual_format(self, alerts_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Convert manipulation alerts data to visual format with charts

        Args:
            alerts_data: Alerts data from agents

        Returns:
            List of alerts with visualizations
        """
        visual_alerts = []

        for alert in alerts_data:
            # Create chart based on alert type
            if 'evidence' in alert and alert['evidence']:
                evidence_data = []

                # Check if evidence is a JSON string
                if isinstance(alert['evidence'], str):
                    try:
                        evidence_data = pd.read_json(alert['evidence'])
                    except:
                        evidence_data = pd.DataFrame()
                else:
                    evidence_data = pd.DataFrame(alert['evidence'])

                # Create visualization based on alert type
                if not evidence_data.empty:
                    import plotly.graph_objects as go
                    import plotly.express as px

                    # Check for timestamp column
                    if 'Timestamp' in evidence_data.columns:
                        time_col = 'Timestamp'
                    elif 'timeStamp' in evidence_data.columns:
                        time_col = 'timeStamp'
                    elif 'timestamp' in evidence_data.columns:
                        time_col = 'timestamp'
                    else:
                        time_col = None

                    # Different visualizations based on alert type
                    if alert.get('type') == 'Wash Trading' and time_col:
                        # Create scatter plot of wash trading
                        fig = px.scatter(
                            evidence_data,
                            x=time_col,
                            y=evidence_data.get('Amount', evidence_data.get('tokenAmount', evidence_data.get('value', 0))),
                            color=evidence_data.get('From', evidence_data.get('from', 'Unknown')),
                            title=f"Wash Trading Evidence: {alert.get('title', '')}"
                        )
                    elif alert.get('type') == 'Pump and Dump' and time_col and 'pre_price' in evidence_data.columns:
                        # Create price line for pump and dump
                        fig = go.Figure()

                        # Plot price line
                        fig.add_trace(go.Scatter(
                            x=evidence_data[time_col],
                            y=evidence_data['pre_price'],
                            mode='lines+markers',
                            name='Price Before Transaction',
                            line=dict(color='blue')
                        ))

                        fig.add_trace(go.Scatter(
                            x=evidence_data[time_col],
                            y=evidence_data['post_price'],
                            mode='lines+markers',
                            name='Price After Transaction',
                            line=dict(color='red')
                        ))

                        fig.update_layout(
                            title=f"Pump and Dump Evidence: {alert.get('title', '')}",
                            xaxis_title='Time',
                            yaxis_title='Price',
                            hovermode='closest'
                        )
                    elif alert.get('type') == 'Momentum Ignition' and time_col and 'impact_pct' in evidence_data.columns:
                        # Create impact scatter for momentum ignition
                        fig = px.scatter(
                            evidence_data,
                            x=time_col,
                            y='impact_pct',
                            size=abs(evidence_data['impact_pct']),
                            color='impact_pct',
                            color_continuous_scale='RdBu',
                            title=f"Momentum Ignition Evidence: {alert.get('title', '')}"
                        )
                    else:
                        # Generic timeline view
                        if time_col:
                            fig = px.timeline(
                                evidence_data,
                                x_start=time_col,
                                x_end=time_col,
                                y=evidence_data.get('From', evidence_data.get('from', 'Unknown')),
                                color=alert.get('risk_level', 'Medium'),
                                title=f"Alert Evidence: {alert.get('title', '')}"
                            )
                        else:
                            fig = None
                else:
                    fig = None
            else:
                fig = None
                evidence_data = pd.DataFrame()

            # Create visual alert object
            visual_alert = {
                "type": alert.get("type", "Unknown"),
                "addresses": alert.get("addresses", []),
                "risk_level": alert.get("risk_level", "Medium"),
                "description": alert.get("description", ""),
                "detection_time": alert.get("detection_time", datetime.now().strftime("%Y-%m-%d %H:%M:%S")),
                "title": alert.get("title", "Alert"),
                "evidence": evidence_data,
                "chart": fig
            }

            visual_alerts.append(visual_alert)

        return visual_alerts
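
A hedged sketch of how the crew system above might be driven from the Streamlit front end. The actual wiring lives in `app.py`, which is not shown in this excerpt, so the glue code, wallet address, and date range below are placeholders; only the constructor and method signatures are taken from the module itself.

```python
# Illustrative only: wires the clients, processor, and crew system together.
import os
from datetime import datetime, timedelta
from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from modules.crew_system import WhaleAnalysisCrewSystem

crew_system = WhaleAnalysisCrewSystem(
    arbiscan_client=ArbiscanClient(api_key=os.getenv("ARBISCAN_API_KEY")),
    gemini_client=GeminiClient(api_key=os.getenv("GEMINI_API_KEY")),
    data_processor=DataProcessor(),
)

wallets = ["0x0000000000000000000000000000000000000000"]  # placeholder address
end = datetime.utcnow()
start = end - timedelta(days=7)

# Track transfers worth more than $100K over the last week
large_txs = crew_system.track_large_transactions(
    wallets=wallets,
    start_date=start,
    end_date=end,
    threshold_value=100_000,
    threshold_type="USD Value",
)

# Scan the same window for manipulation signals at medium sensitivity
alerts = crew_system.detect_manipulation(
    wallets=wallets,
    start_date=start,
    end_date=end,
    sensitivity="Medium",
)
```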
950 |
+
    def generate_report(self,
                        wallets: List[str],
                        start_date: datetime,
                        end_date: datetime,
                        report_type: str = "Transaction Summary",
                        export_format: str = "PDF") -> Dict[str, Any]:
        """
        Generate a report of whale activity

        Args:
            wallets: List of wallet addresses to include in the report
            start_date: Start date for report period
            end_date: End date for report period
            report_type: Type of report to generate
            export_format: Format for the report (CSV, PDF, PNG)

        Returns:
            Dictionary with report data
        """
        from modules.visualizer import Visualizer
        visualizer = Visualizer()

        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        report_task = Task(
            description=f"""
            Generate a {report_type} report in {export_format} format.
            The report should include:
            1. Executive summary of wallet activity
            2. Transaction analysis
            3. Pattern identification (if applicable)
            4. Price impact analysis (if applicable)
            5. Manipulation detection (if applicable)

            Organize the information clearly and provide actionable insights.
            """,
            agent=agents["report_generator"],
            expected_output=f"""
            A complete {export_format} report with all relevant analyses.
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["report_generator"]],
            tasks=[data_collection_task, report_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result - for reports, we'll use our visualizer directly
        # First, get transaction data
        all_transactions = []

        for wallet in wallets:
            transfers = self.arbiscan_client.fetch_all_token_transfers(
                address=wallet
            )
            all_transactions.extend(transfers)

        if not all_transactions:
            return {
                "filename": f"no_data_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.{export_format.lower()}",
                "content": ""
            }

        transactions_df = pd.DataFrame(all_transactions)

        # Generate the report based on format
        filename = f"whale_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        if export_format == "CSV":
            content = visualizer.generate_csv_report(
                transactions_df=transactions_df,
                report_type=report_type
            )
            filename += ".csv"

            return {
                "filename": filename,
                "content": content
            }

        elif export_format == "PDF":
            # For PDF we need to get more data
            # Run pattern detection
            patterns = self.identify_trading_patterns(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            # Run price impact analysis
            price_impact = self.analyze_price_impact(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            # Run manipulation detection
            alerts = self.detect_manipulation(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            content = visualizer.generate_pdf_report(
                transactions_df=transactions_df,
                patterns=patterns,
                price_impact=price_impact,
                alerts=alerts,
                title=f"Whale Analysis Report: {report_type}",
                start_date=start_date,
                end_date=end_date
            )
            filename += ".pdf"

            return {
                "filename": filename,
                "content": content
            }

        elif export_format == "PNG":
            # For PNG we'll create a chart based on report type
            if report_type == "Transaction Summary":
                fig = visualizer.create_transaction_timeline(transactions_df)
            elif report_type == "Pattern Analysis":
                fig = visualizer.create_volume_chart(transactions_df)
            elif report_type == "Price Impact":
                # Run price impact analysis first
                price_impact = self.analyze_price_impact(
                    wallets=wallets,
                    start_date=start_date,
                    end_date=end_date
                )
                fig = price_impact.get('impact_chart', visualizer.create_transaction_timeline(transactions_df))
            else:  # "Manipulation Detection" or "Complete Analysis"
                fig = visualizer.create_network_graph(transactions_df)

            content = visualizer.generate_png_chart(fig)
            filename += ".png"

            return {
                "filename": filename,
                "content": content
            }

        else:
            return {
                "filename": f"unsupported_format_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt",
                "content": "Unsupported export format requested."
            }
modules/crew_tools.py
ADDED
@@ -0,0 +1,362 @@
"""
Properly implemented tools for the WhaleAnalysisCrewSystem
"""

import json
import pandas as pd
from datetime import datetime
from typing import Any, Dict, List, Optional, Type
from pydantic import BaseModel, Field
import logging

from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from langchain.tools import BaseTool


class GetTokenTransfersInput(BaseModel):
    """Input for the get_token_transfers tool."""
    address: str = Field(..., description="Wallet address to query")
    contract_address: Optional[str] = Field(None, description="Optional token contract address to filter by")


# Global clients that will be used by all tools
_GLOBAL_ARBISCAN_CLIENT = None
_GLOBAL_GEMINI_CLIENT = None
_GLOBAL_DATA_PROCESSOR = None

def set_global_clients(arbiscan_client=None, gemini_client=None, data_processor=None):
    """Set global client instances that will be used by all tools"""
    global _GLOBAL_ARBISCAN_CLIENT, _GLOBAL_GEMINI_CLIENT, _GLOBAL_DATA_PROCESSOR
    if arbiscan_client:
        _GLOBAL_ARBISCAN_CLIENT = arbiscan_client
    if gemini_client:
        _GLOBAL_GEMINI_CLIENT = gemini_client
    if data_processor:
        _GLOBAL_DATA_PROCESSOR = data_processor

class ArbiscanGetTokenTransfersTool(BaseTool):
    """Tool for fetching token transfers from Arbiscan."""
    name = "arbiscan_get_token_transfers"
    description = "Get ERC-20 token transfers for a specific address"
    args_schema: Type[BaseModel] = GetTokenTransfersInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, contract_address: Optional[str] = None) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            transfers = _GLOBAL_ARBISCAN_CLIENT.get_token_transfers(
                address=address,
                contract_address=contract_address
            )
            return json.dumps(transfers)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetTokenTransfersTool: {str(e)}")
            return json.dumps({"error": str(e)})

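The module-level globals above let every tool reach a shared client even when the agent framework re-instantiates tools without constructor arguments. A short sketch of the intended flow is below; it assumes `ArbiscanClient()` can be constructed with the API key taken from the environment, which may differ from the real constructor signature in `modules/api_client.py`.

```python
from modules.api_client import ArbiscanClient
from modules.crew_tools import set_global_clients, ArbiscanGetTokenTransfersTool

# Register a shared client once at startup (constructor arguments assumed).
set_global_clients(arbiscan_client=ArbiscanClient())

tool = ArbiscanGetTokenTransfersTool()
# BaseTool.run() validates the input against GetTokenTransfersInput and
# dispatches to _run(); the result is a JSON string.
result_json = tool.run({"address": "0x1234567890abcdef1234567890abcdef12345678"})
print(result_json)
```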

class GetNormalTransactionsInput(BaseModel):
    """Input for the get_normal_transactions tool."""
    address: str = Field(..., description="Wallet address to query")


class ArbiscanGetNormalTransactionsTool(BaseTool):
    """Tool for fetching normal transactions from Arbiscan."""
    name = "arbiscan_get_normal_transactions"
    description = "Get normal transactions (ETH/ARB transfers) for a specific address"
    args_schema: Type[BaseModel] = GetNormalTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, startblock: int = 0, endblock: int = 99999999, page: int = 1, offset: int = 10) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            txs = _GLOBAL_ARBISCAN_CLIENT.get_normal_transactions(
                address=address,
                start_block=startblock,
                end_block=endblock,
                page=page,
                offset=offset
            )
            return json.dumps(txs)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetNormalTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetInternalTransactionsInput(BaseModel):
    """Input for the get_internal_transactions tool."""
    address: str = Field(..., description="Wallet address to query")


class ArbiscanGetInternalTransactionsTool(BaseTool):
    """Tool for fetching internal transactions from Arbiscan."""
    name = "arbiscan_get_internal_transactions"
    description = "Get internal transactions for a specific address"
    args_schema: Type[BaseModel] = GetInternalTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, startblock: int = 0, endblock: int = 99999999, page: int = 1, offset: int = 10) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            txs = _GLOBAL_ARBISCAN_CLIENT.get_internal_transactions(
                address=address,
                start_block=startblock,
                end_block=endblock,
                page=page,
                offset=offset
            )
            return json.dumps(txs)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetInternalTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class FetchWhaleTransactionsInput(BaseModel):
    """Input for the fetch_whale_transactions tool."""
    addresses: List[str] = Field(..., description="List of wallet addresses to query")
    token_address: Optional[str] = Field(None, description="Optional token contract address to filter by")
    min_token_amount: Optional[float] = Field(None, description="Minimum token amount")
    min_usd_value: Optional[float] = Field(None, description="Minimum USD value")


class ArbiscanFetchWhaleTransactionsTool(BaseTool):
    """Tool for fetching whale transactions from Arbiscan."""
    name = "arbiscan_fetch_whale_transactions"
    description = "Fetch whale transactions for a list of addresses"
    args_schema: Type[BaseModel] = FetchWhaleTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, addresses: List[str], token_address: Optional[str] = None,
             min_token_amount: Optional[float] = None, min_usd_value: Optional[float] = None) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            transactions_df = _GLOBAL_ARBISCAN_CLIENT.fetch_whale_transactions(
                addresses=addresses,
                token_address=token_address,
                min_token_amount=min_token_amount,
                min_usd_value=min_usd_value,
                max_pages=5  # Limit to 5 pages to prevent excessive API calls
            )
            return transactions_df.to_json(orient="records")
        except Exception as e:
            logging.error(f"Error in ArbiscanFetchWhaleTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetCurrentPriceInput(BaseModel):
    """Input for the get_current_price tool."""
    symbol: str = Field(..., description="Token symbol (e.g., 'ETHUSD')")


class GeminiGetCurrentPriceTool(BaseTool):
    """Tool for getting current token price from Gemini."""
    name = "gemini_get_current_price"
    description = "Get the current price of a token"
    args_schema: Type[BaseModel] = GetCurrentPriceInput

    def __init__(self, gemini_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if gemini_client:
            set_global_clients(gemini_client=gemini_client)

    def _run(self, symbol: str) -> str:
        global _GLOBAL_GEMINI_CLIENT

        if not _GLOBAL_GEMINI_CLIENT:
            return json.dumps({"error": "Gemini client not initialized. Please set global client first."})

        try:
            price = _GLOBAL_GEMINI_CLIENT.get_current_price(symbol)
            return json.dumps({"symbol": symbol, "price": price})
        except Exception as e:
            logging.error(f"Error in GeminiGetCurrentPriceTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetHistoricalPricesInput(BaseModel):
    """Input for the get_historical_prices tool."""
    symbol: str = Field(..., description="Token symbol (e.g., 'ETHUSD')")
    start_time: str = Field(..., description="Start datetime in ISO format")
    end_time: str = Field(..., description="End datetime in ISO format")


class GeminiGetHistoricalPricesTool(BaseTool):
    """Tool for getting historical token prices from Gemini."""
    name = "gemini_get_historical_prices"
    description = "Get historical prices for a token within a time range"
    args_schema: Type[BaseModel] = GetHistoricalPricesInput

    def __init__(self, gemini_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise we'll use global instance
        if gemini_client:
            set_global_clients(gemini_client=gemini_client)

    def _run(
        self,
        symbol: str,
        start_time: Optional[str] = None,
        end_time: Optional[str] = None,
        interval: str = "15m"
    ) -> str:
        global _GLOBAL_GEMINI_CLIENT

        if not _GLOBAL_GEMINI_CLIENT:
            return json.dumps({"error": "Gemini client not initialized. Please set global client first."})

        try:
            # Convert string times to datetime if provided
            start_dt = None
            end_dt = None

            if start_time:
                start_dt = datetime.fromisoformat(start_time)
            if end_time:
                end_dt = datetime.fromisoformat(end_time)

            prices = _GLOBAL_GEMINI_CLIENT.get_historical_prices(
                symbol=symbol,
                start_time=start_dt,
                end_time=end_dt,
                interval=interval
            )

            return json.dumps(prices)
        except Exception as e:
            logging.error(f"Error in GeminiGetHistoricalPricesTool: {str(e)}")
            return json.dumps({"error": str(e)})


class IdentifyPatternsInput(BaseModel):
    """Input for the identify_patterns tool."""
    transactions_json: str = Field(..., description="JSON string of transactions")
    n_clusters: int = Field(3, description="Number of clusters for K-Means")


class DataProcessorIdentifyPatternsTool(BaseTool):
    """Tool for identifying trading patterns using the DataProcessor."""
    name = "data_processor_identify_patterns"
    description = "Identify trading patterns in a set of transactions"
    args_schema: Type[BaseModel] = IdentifyPatternsInput

    def __init__(self, data_processor=None):
        super().__init__()
        # Store reference to processor if provided, otherwise we'll use global instance
        if data_processor:
            set_global_clients(data_processor=data_processor)

    def _run(self, transactions_json: List[Dict[str, Any]], n_clusters: int = 3) -> str:
        global _GLOBAL_DATA_PROCESSOR

        if not _GLOBAL_DATA_PROCESSOR:
            return json.dumps({"error": "Data processor not initialized. Please set global processor first."})

        try:
            # Convert JSON to DataFrame
            transactions_df = pd.DataFrame(transactions_json)

            # Ensure required columns exist
            required_columns = ['timeStamp', 'hash', 'from', 'to', 'value', 'tokenSymbol']
            for col in required_columns:
                if col not in transactions_df.columns:
                    return json.dumps({
                        "error": f"Missing required column: {col}",
                        "available_columns": list(transactions_df.columns)
                    })

            # Run pattern identification
            patterns = _GLOBAL_DATA_PROCESSOR.identify_patterns(
                transactions_df=transactions_df,
                n_clusters=n_clusters
            )

            return json.dumps(patterns)
        except Exception as e:
            logging.error(f"Error in DataProcessorIdentifyPatternsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class DetectAnomalousTransactionsInput(BaseModel):
    """Input for the detect_anomalous_transactions tool."""
    transactions_json: str = Field(..., description="JSON string of transactions")
    sensitivity: str = Field("Medium", description="Detection sensitivity ('Low', 'Medium', 'High')")


class DataProcessorDetectAnomalousTransactionsTool(BaseTool):
    """Tool for detecting anomalous transactions using the DataProcessor."""
    name = "data_processor_detect_anomalies"
    description = "Detect anomalous transactions in a dataset"
    args_schema: Type[BaseModel] = DetectAnomalousTransactionsInput

    def __init__(self, data_processor=None):
        super().__init__()
        # Store reference to processor if provided, otherwise we'll use global instance
        if data_processor:
            set_global_clients(data_processor=data_processor)

    def _run(self, transactions_json: List[Dict[str, Any]], sensitivity: str = "Medium") -> str:
        global _GLOBAL_DATA_PROCESSOR

        if not _GLOBAL_DATA_PROCESSOR:
            return json.dumps({"error": "Data processor not initialized. Please set global processor first."})

        try:
            # Convert JSON to DataFrame
            transactions_df = pd.DataFrame(transactions_json)

            # Ensure required columns exist
            required_columns = ['timeStamp', 'hash', 'from', 'to', 'value', 'tokenSymbol']
            for col in required_columns:
                if col not in transactions_df.columns:
                    return json.dumps({
                        "error": f"Missing required column: {col}",
                        "available_columns": list(transactions_df.columns)
                    })

            # Run anomaly detection
            anomalies = _GLOBAL_DATA_PROCESSOR.detect_anomalous_transactions(
                transactions_df=transactions_df,
                sensitivity=sensitivity
            )

            return json.dumps(anomalies)
        except Exception as e:
            logging.error(f"Error in DataProcessorDetectAnomalousTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})
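These tool classes are meant to be handed to CrewAI agents. A hedged sketch of that wiring is below; the role, goal, and backstory strings are made up for illustration and the real agent definitions live in `modules/crew_system.py`.

```python
from crewai import Agent
from modules.crew_tools import (
    ArbiscanGetTokenTransfersTool,
    ArbiscanFetchWhaleTransactionsTool,
    GeminiGetCurrentPriceTool,
)

# Illustrative agent definition: the descriptive strings are placeholders.
data_collector = Agent(
    role="Blockchain Data Collector",
    goal="Gather whale transaction data from Arbitrum",
    backstory="Specialist in on-chain data retrieval",
    tools=[
        ArbiscanGetTokenTransfersTool(),
        ArbiscanFetchWhaleTransactionsTool(),
        GeminiGetCurrentPriceTool(),
    ],
    verbose=True,
)
```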
|
modules/data_processor.py
ADDED
@@ -0,0 +1,1425 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Union, Any, Tuple
from sklearn.cluster import KMeans, DBSCAN
from sklearn.preprocessing import StandardScaler
import plotly.graph_objects as go
import plotly.express as px
import logging
import time

class DataProcessor:
    """
    Process and analyze transaction data from blockchain APIs
    """

    def __init__(self):
        pass

    def aggregate_transactions(self,
                               transactions_df: pd.DataFrame,
                               time_window: str = 'D') -> pd.DataFrame:
        """
        Aggregate transactions by time window

        Args:
            transactions_df: DataFrame of transactions
            time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)

        Returns:
            Aggregated DataFrame with transaction counts and volumes
        """
        if transactions_df.empty:
            return pd.DataFrame()

        # Ensure timestamp column is datetime
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure amount column exists
        if 'Amount' in transactions_df.columns:
            amount_col = 'Amount'
        elif 'tokenAmount' in transactions_df.columns:
            amount_col = 'tokenAmount'
        elif 'value' in transactions_df.columns:
            # Try to adjust for decimals if 'tokenDecimal' exists
            if 'tokenDecimal' in transactions_df.columns:
                transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                amount_col = 'adjustedValue'
            else:
                amount_col = 'value'
        else:
            raise ValueError("Amount column not found in transactions DataFrame")

        # Resample by time window
        transactions_df = transactions_df.copy()
        try:
            transactions_df.set_index(pd.DatetimeIndex(transactions_df[timestamp_col]), inplace=True)
        except Exception as e:
            print(f"Error setting DatetimeIndex: {str(e)}")
            # Create a safe index as a fallback
            transactions_df['safe_timestamp'] = pd.date_range(
                start='2025-01-01',
                periods=len(transactions_df),
                freq='H'
            )
            transactions_df.set_index('safe_timestamp', inplace=True)

        # Identify buy vs sell transactions based on 'from' and 'to' addresses
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            # If we can't determine direction, just aggregate total volume
            agg_df = transactions_df.resample(time_window).agg({
                amount_col: 'sum',
                timestamp_col: 'count'
            })
            agg_df.columns = ['Volume', 'Count']
            return agg_df.reset_index()

        # Calculate net flow for each wallet address (positive = inflow, negative = outflow)
        wallet_addresses = set(transactions_df[from_col].unique()) | set(transactions_df[to_col].unique())

        results = []
        for wallet in wallet_addresses:
            wallet_df = transactions_df.copy()

            # Mark transactions as inflow or outflow
            wallet_df['Direction'] = 'Unknown'
            wallet_df.loc[wallet_df[to_col] == wallet, 'Direction'] = 'In'
            wallet_df.loc[wallet_df[from_col] == wallet, 'Direction'] = 'Out'

            # Calculate net flow
            wallet_df['NetFlow'] = wallet_df[amount_col]
            wallet_df.loc[wallet_df['Direction'] == 'Out', 'NetFlow'] = -wallet_df.loc[wallet_df['Direction'] == 'Out', amount_col]

            # Aggregate by time window
            wallet_agg = wallet_df.resample(time_window).agg({
                'NetFlow': 'sum',
                timestamp_col: 'count'
            })
            wallet_agg.columns = ['NetFlow', 'Count']
            wallet_agg['Wallet'] = wallet

            results.append(wallet_agg.reset_index())

        if not results:
            return pd.DataFrame()

        combined_df = pd.concat(results, ignore_index=True)
        return combined_df

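`aggregate_transactions` buckets transfers into a time window and reports each wallet's net flow (inflow minus outflow) and transfer count per bucket. A small self-contained usage sketch, using the lower-case column layout Arbiscan-style transfers arrive in:

```python
import pandas as pd
from modules.data_processor import DataProcessor

# Toy transfer data; 'value' is in raw token units, adjusted via 'tokenDecimal'.
txs = pd.DataFrame({
    "timeStamp": pd.to_datetime(["2025-01-01 09:00", "2025-01-01 15:00", "2025-01-02 10:00"]),
    "from": ["0xaaa", "0xbbb", "0xaaa"],
    "to": ["0xbbb", "0xaaa", "0xccc"],
    "value": [1_000 * 10**18, 250 * 10**18, 400 * 10**18],
    "tokenDecimal": [18, 18, 18],
})

processor = DataProcessor()
# One row per (day, wallet) with that wallet's net token flow and transfer count.
daily_flows = processor.aggregate_transactions(txs, time_window="D")
print(daily_flows.head())
```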
    # Cache for pattern identification to avoid repeating expensive calculations
    _pattern_cache = {}

    def identify_patterns(self,
                          transactions_df: pd.DataFrame,
                          n_clusters: int = 3) -> List[Dict[str, Any]]:
        """
        Identify trading patterns using clustering algorithms

        Args:
            transactions_df: DataFrame of transactions
            n_clusters: Number of clusters to identify

        Returns:
            List of pattern dictionaries containing name, description, and confidence
        """
        # Check for empty data early to avoid processing
        if transactions_df.empty:
            return []

        # Create a cache key based on DataFrame hash and number of clusters
        try:
            cache_key = f"{hash(tuple(transactions_df.columns))}_{len(transactions_df)}_{n_clusters}"

            # Check cache first
            if cache_key in self._pattern_cache:
                return self._pattern_cache[cache_key]
        except Exception:
            # If hashing fails, proceed without caching
            cache_key = None

        try:
            # Create a reference instead of a deep copy to improve memory usage
            df = transactions_df

            # Ensure timestamp column exists - optimize column presence checks
            timestamp_cols = ['Timestamp', 'timeStamp']
            timestamp_col = next((col for col in timestamp_cols if col in df.columns), None)

            if timestamp_col:
                # Convert timestamp only if needed
                if not pd.api.types.is_datetime64_any_dtype(df[timestamp_col]):
                    try:
                        # Use vectorized operations instead of astype where possible
                        if df[timestamp_col].dtype == 'object':
                            df[timestamp_col] = pd.to_datetime(df[timestamp_col], errors='coerce')
                        else:
                            df[timestamp_col] = pd.to_datetime(df[timestamp_col], unit='s', errors='coerce')
                    except Exception as e:
                        # Create a date range index as fallback
                        df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                        timestamp_col = 'dummy_timestamp'
            else:
                # If no timestamp column, create a dummy index
                df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Efficiently calculate floor hour using vectorized operations
            df['hour'] = df[timestamp_col].dt.floor('H')

            # Check for address columns efficiently
            if 'From' in df.columns and 'To' in df.columns:
                from_col, to_col = 'From', 'To'
            elif 'from' in df.columns and 'to' in df.columns:
                from_col, to_col = 'from', 'to'
            else:
                # Create dummy addresses only if necessary
                df['from'] = [f'0x{i:040x}' for i in range(len(df))]
                df['to'] = [f'0x{(i+1):040x}' for i in range(len(df))]
                from_col, to_col = 'from', 'to'

            # Efficiently determine amount column
            amount_cols = ['Amount', 'tokenAmount', 'value', 'adjustedValue']
            amount_col = next((col for col in amount_cols if col in df.columns), None)

            if not amount_col:
                # Handle special case for token values with decimals
                if 'value' in df.columns and 'tokenDecimal' in df.columns:
                    # Vectorized calculation for improved performance
                    try:
                        # Ensure values are numeric
                        df['value_numeric'] = pd.to_numeric(df['value'], errors='coerce')
                        df['tokenDecimal_numeric'] = pd.to_numeric(df['tokenDecimal'], errors='coerce').fillna(18)
                        df['adjustedValue'] = df['value_numeric'] / (10 ** df['tokenDecimal_numeric'])
                        amount_col = 'adjustedValue'
                    except Exception as e:
                        logging.warning(f"Error converting values: {e}")
                        df['dummy_amount'] = 1.0
                        amount_col = 'dummy_amount'
                else:
                    # Fallback to dummy values
                    df['dummy_amount'] = 1.0
                    amount_col = 'dummy_amount'

            # Ensure the amount column is numeric
            try:
                if amount_col in df.columns:
                    df[f"{amount_col}_numeric"] = pd.to_numeric(df[amount_col], errors='coerce').fillna(0)
                    amount_col = f"{amount_col}_numeric"
            except Exception:
                # If conversion fails, create a dummy numeric column
                df['safe_amount'] = 1.0
                amount_col = 'safe_amount'

            # Calculate metrics using optimized groupby operations
            # Use a more efficient approach with built-in pandas aggregation
            agg_df = df.groupby('hour').agg(
                Count=pd.NamedAgg(column=from_col, aggfunc='count'),
            ).reset_index()

            # For NetFlow calculation, we need an additional pass
            # This uses a more efficient calculation method
            def calc_netflow(group):
                # Use optimized filtering and calculations for better performance
                first_to = group[to_col].iloc[0] if len(group) > 0 else None
                first_from = group[from_col].iloc[0] if len(group) > 0 else None

                if first_to is not None and first_from is not None:
                    # Ensure values are converted to numeric before summing
                    try:
                        # Convert to numeric with pd.to_numeric, coerce errors to NaN
                        total_in = pd.to_numeric(group.loc[group[to_col] == first_to, amount_col], errors='coerce').sum()
                        total_out = pd.to_numeric(group.loc[group[from_col] == first_from, amount_col], errors='coerce').sum()
                        # Replace NaN with 0 to avoid propagation
                        if pd.isna(total_in): total_in = 0.0
                        if pd.isna(total_out): total_out = 0.0
                        return float(total_in) - float(total_out)
                    except Exception as e:
                        import logging
                        logging.debug(f"Error converting values to numeric: {e}")
                        return 0.0
                return 0.0

            # Calculate NetFlow using apply instead of loop
            netflows = df.groupby('hour').apply(calc_netflow)
            agg_df['NetFlow'] = netflows.values

            # Early return if not enough data for clustering
            if agg_df.empty or len(agg_df) < n_clusters:
                return []

            # Ensure we don't have too many clusters for the dataset
            actual_n_clusters = min(n_clusters, max(2, len(agg_df) // 2))

            # Prepare features for clustering - with careful type handling
            try:
                if 'NetFlow' in agg_df.columns:
                    # Ensure NetFlow is numeric
                    agg_df['NetFlow'] = pd.to_numeric(agg_df['NetFlow'], errors='coerce').fillna(0)
                    features = agg_df[['NetFlow', 'Count']].copy()
                    primary_metric = 'NetFlow'
                else:
                    # Calculate Volume if needed
                    if 'Volume' not in agg_df.columns and amount_col in df.columns:
                        # Calculate volume with numeric conversion
                        volume_by_hour = pd.to_numeric(df[amount_col], errors='coerce').fillna(0).groupby(df['hour']).sum()
                        agg_df['Volume'] = agg_df['hour'].map(volume_by_hour)

                    # Ensure Volume exists and is numeric
                    if 'Volume' not in agg_df.columns:
                        agg_df['Volume'] = 1.0  # Default value if calculation failed
                    else:
                        agg_df['Volume'] = pd.to_numeric(agg_df['Volume'], errors='coerce').fillna(1.0)

                    # Ensure Count is numeric
                    agg_df['Count'] = pd.to_numeric(agg_df['Count'], errors='coerce').fillna(1.0)

                    features = agg_df[['Volume', 'Count']].copy()
                    primary_metric = 'Volume'

                # Final check to ensure features are numeric
                for col in features.columns:
                    features[col] = pd.to_numeric(features[col], errors='coerce').fillna(0)
            except Exception as e:
                logging.warning(f"Error preparing clustering features: {e}")
                # Create safe dummy features if everything else fails
                agg_df['SafeFeature'] = 1.0
                agg_df['Count'] = 1.0
                features = agg_df[['SafeFeature', 'Count']].copy()
                primary_metric = 'SafeFeature'

            # Scale features - import only when needed for efficiency
            from sklearn.preprocessing import StandardScaler
            scaler = StandardScaler()
            scaled_features = scaler.fit_transform(features)

            # Use K-Means with reduced complexity
            from sklearn.cluster import KMeans
            kmeans = KMeans(n_clusters=actual_n_clusters, random_state=42, n_init=10, max_iter=100)
            agg_df['Cluster'] = kmeans.fit_predict(scaled_features)

            # Calculate time-based metrics from the hour column directly
            if 'hour' in agg_df.columns:
                try:
                    # Convert to datetime for hour and day extraction if needed
                    hour_series = pd.to_datetime(agg_df['hour'])
                    agg_df['Hour'] = hour_series.dt.hour
                    agg_df['Day'] = hour_series.dt.dayofweek
                except Exception:
                    # Fallback for non-convertible data
                    agg_df['Hour'] = 0
                    agg_df['Day'] = 0
            else:
                # Default values if no hour column
                agg_df['Hour'] = 0
                agg_df['Day'] = 0

            # Identify patterns efficiently
            patterns = []
            for i in range(actual_n_clusters):
                # Use boolean indexing for better performance
                cluster_mask = agg_df['Cluster'] == i
                cluster_df = agg_df[cluster_mask]

                if len(cluster_df) == 0:
                    continue

                if primary_metric == 'NetFlow':
                    # Use numpy methods for faster calculation
                    avg_flow = cluster_df['NetFlow'].mean()
                    flow_std = cluster_df['NetFlow'].std()
                    behavior = "Accumulation" if avg_flow > 0 else "Distribution"
                    volume_metric = f"Net Flow: {avg_flow:.2f} ± {flow_std:.2f}"
                else:
                    # Use Volume metrics - optimize to avoid redundant calculations
                    avg_volume = cluster_df['Volume'].mean() if 'Volume' in cluster_df else 0
                    volume_std = cluster_df['Volume'].std() if 'Volume' in cluster_df else 0
                    behavior = "High Volume" if 'Volume' in agg_df and avg_volume > agg_df['Volume'].mean() else "Low Volume"
                    volume_metric = f"Volume: {avg_volume:.2f} ± {volume_std:.2f}"

                # Pattern characteristics
                pattern_metrics = {
                    "avg_flow": avg_flow,
                    "flow_std": flow_std,
                    "avg_count": cluster_df['Count'].mean(),
                    "max_flow": cluster_df['NetFlow'].max(),
                    "min_flow": cluster_df['NetFlow'].min(),
                    "common_hour": cluster_df['Hour'].mode()[0] if not cluster_df['Hour'].empty else None,
                    "common_day": cluster_df['Day'].mode()[0] if not cluster_df['Day'].empty else None
                }

                # Enhanced confidence calculation
                if primary_metric == 'NetFlow':
                    # Calculate within-cluster variance as a percentage of total variance
                    cluster_variance = cluster_df['NetFlow'].var()
                    total_variance = agg_df['NetFlow'].var() or 1  # Avoid division by zero
                    confidence = max(0.4, min(0.95, 1 - (cluster_variance / total_variance)))
                else:
                    # Calculate within-cluster variance as a percentage of total variance
                    cluster_variance = cluster_df['Volume'].var()
                    total_variance = agg_df['Volume'].var() or 1  # Avoid division by zero
                    confidence = max(0.4, min(0.95, 1 - (cluster_variance / total_variance)))

                # Create enhanced pattern charts - Main Chart
                if primary_metric == 'NetFlow':
                    main_fig = px.scatter(cluster_df, x=cluster_df.index, y='NetFlow',
                                          size='Count', color='Cluster',
                                          title=f"Pattern {i+1}: {behavior}",
                                          labels={'NetFlow': 'Net Token Flow', 'index': 'Time'},
                                          color_discrete_sequence=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

                    # Add a trend line
                    main_fig.add_trace(go.Scatter(
                        x=cluster_df.index,
                        y=cluster_df['NetFlow'].rolling(window=3, min_periods=1).mean(),
                        mode='lines',
                        name='Trend',
                        line=dict(width=2, dash='dash', color='rgba(0,0,0,0.5)')
                    ))

                    # Add a zero reference line
                    main_fig.add_shape(
                        type="line",
                        x0=cluster_df.index.min(),
                        y0=0,
                        x1=cluster_df.index.max(),
                        y1=0,
                        line=dict(color="red", width=1, dash="dot"),
                    )
                else:
                    main_fig = px.scatter(cluster_df, x=cluster_df.index, y='Volume',
                                          size='Count', color='Cluster',
                                          title=f"Pattern {i+1}: {behavior}",
                                          labels={'Volume': 'Transaction Volume', 'index': 'Time'},
                                          color_discrete_sequence=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

                    # Add a trend line
                    main_fig.add_trace(go.Scatter(
                        x=cluster_df.index,
                        y=cluster_df['Volume'].rolling(window=3, min_periods=1).mean(),
                        mode='lines',
                        name='Trend',
                        line=dict(width=2, dash='dash', color='rgba(0,0,0,0.5)')
                    ))

                main_fig.update_layout(
                    template="plotly_white",
                    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
                    margin=dict(l=20, r=20, t=50, b=20),
                    height=400
                )

                # Create hourly distribution chart
                hour_counts = cluster_df.groupby('Hour')['Count'].sum().reindex(range(24), fill_value=0)
                hour_fig = px.bar(x=hour_counts.index, y=hour_counts.values,
                                  title="Hourly Distribution",
                                  labels={'x': 'Hour of Day', 'y': 'Transaction Count'},
                                  color_discrete_sequence=['#1f77b4'])
                hour_fig.update_layout(template="plotly_white", height=300)

                # Create volume/flow distribution chart
                if primary_metric == 'NetFlow':
                    hist_data = cluster_df['NetFlow']
                    hist_title = "Net Flow Distribution"
                    hist_label = "Net Flow"
                else:
                    hist_data = cluster_df['Volume']
                    hist_title = "Volume Distribution"
                    hist_label = "Volume"

                dist_fig = px.histogram(hist_data,
                                        title=hist_title,
                                        labels={'value': hist_label, 'count': 'Frequency'},
                                        color_discrete_sequence=['#2ca02c'])
                dist_fig.update_layout(template="plotly_white", height=300)

                # Find related transactions
                if not transactions_df.empty:
                    # Get timestamps from this cluster
                    cluster_times = pd.to_datetime(cluster_df.index)
                    # Create time windows for matching
                    time_windows = [(t - pd.Timedelta(hours=1), t + pd.Timedelta(hours=1)) for t in cluster_times]

                    # Find transactions within these time windows
                    pattern_txs = transactions_df[transactions_df[timestamp_col].apply(
                        lambda x: any((start <= x <= end) for start, end in time_windows)
                    )].copy()

                    # If we have too many, sample them
                    if len(pattern_txs) > 10:
                        pattern_txs = pattern_txs.sample(10)

                    # If we have too few, just sample from all transactions
                    if len(pattern_txs) < 5 and len(transactions_df) >= 5:
                        pattern_txs = transactions_df.sample(min(5, len(transactions_df)))
                else:
                    pattern_txs = pd.DataFrame()

                # Comprehensive pattern dictionary
                pattern = {
                    "name": behavior,
                    "description": f"This pattern shows {behavior.lower()} activity.",
                    "strategy": "Unknown",
                    "risk_profile": "Unknown",
                    "time_insight": "Unknown",
                    "cluster_id": i,
                    "metrics": pattern_metrics,
                    "occurrence_count": len(cluster_df),
                    "volume_metric": volume_metric,
                    "confidence": confidence,
                    "impact": 0.0,
                    "charts": {
                        "main": main_fig,
                        "hourly_distribution": hour_fig,
                        "value_distribution": dist_fig
                    },
                    "examples": pattern_txs
                }

                patterns.append(pattern)

            # Cache results for future reuse
            if cache_key:
                self._pattern_cache[cache_key] = patterns

            return patterns

        except Exception as e:
            import logging
            logging.warning(f"Error during pattern identification: {str(e)}")
            return []

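The core of `identify_patterns` is simple: bucket transfers per hour, build a (NetFlow, Count) feature per bucket, standardize, and run K-Means, then label clusters by the sign of their mean net flow. A minimal standalone sketch of that clustering step on synthetic data:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic hourly buckets standing in for agg_df: net token flow and tx count.
rng = np.random.default_rng(42)
agg_df = pd.DataFrame({
    "NetFlow": np.concatenate([rng.normal(500, 50, 20), rng.normal(-400, 60, 20)]),
    "Count": np.concatenate([rng.integers(1, 5, 20), rng.integers(10, 30, 20)]),
})

# Same recipe as identify_patterns: standardize, then K-Means on the two features.
features = StandardScaler().fit_transform(agg_df[["NetFlow", "Count"]])
agg_df["Cluster"] = KMeans(n_clusters=2, random_state=42, n_init=10).fit_predict(features)

# A cluster with positive mean net flow is labelled "Accumulation", negative "Distribution".
for cluster_id, cluster in agg_df.groupby("Cluster"):
    behavior = "Accumulation" if cluster["NetFlow"].mean() > 0 else "Distribution"
    print(cluster_id, behavior, round(cluster["NetFlow"].mean(), 2))
```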
        # NOTE: the block below duplicates the chart-construction code inside
        # identify_patterns and sits after that method's return statements, so it
        # is unreachable; it also references names (description, strategy,
        # risk_profile, time_insight) that are never defined. It is preserved
        # here as it appears in the added file.
        # Create enhanced pattern detection method with visualization capabilities
        if primary_metric == 'NetFlow':
            main_fig = px.scatter(cluster_df, x=cluster_df.index, y='NetFlow',
                                  size='Count', color='Cluster',
                                  title=f"Pattern {i+1}: {behavior}",
                                  labels={'NetFlow': 'Net Token Flow', 'index': 'Time'},
                                  color_discrete_sequence=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

            # Add a trend line
            main_fig.add_trace(go.Scatter(
                x=cluster_df.index,
                y=cluster_df['NetFlow'].rolling(window=3, min_periods=1).mean(),
                mode='lines',
                name='Trend',
                line=dict(width=2, dash='dash', color='rgba(0,0,0,0.5)')
            ))

            # Add a zero reference line
            main_fig.add_shape(
                type="line",
                x0=cluster_df.index.min(),
                y0=0,
                x1=cluster_df.index.max(),
                y1=0,
                line=dict(color="red", width=1, dash="dot"),
            )
        else:
            main_fig = px.scatter(cluster_df, x=cluster_df.index, y='Volume',
                                  size='Count', color='Cluster',
                                  title=f"Pattern {i+1}: {behavior}",
                                  labels={'Volume': 'Transaction Volume', 'index': 'Time'},
                                  color_discrete_sequence=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

            # Add a trend line
            main_fig.add_trace(go.Scatter(
                x=cluster_df.index,
                y=cluster_df['Volume'].rolling(window=3, min_periods=1).mean(),
                mode='lines',
                name='Trend',
                line=dict(width=2, dash='dash', color='rgba(0,0,0,0.5)')
            ))

        main_fig.update_layout(
            template="plotly_white",
            legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
            margin=dict(l=20, r=20, t=50, b=20),
            height=400
        )

        # Create hourly distribution chart
        hour_counts = cluster_df.groupby('Hour')['Count'].sum().reindex(range(24), fill_value=0)
        hour_fig = px.bar(x=hour_counts.index, y=hour_counts.values,
                          title="Hourly Distribution",
                          labels={'x': 'Hour of Day', 'y': 'Transaction Count'},
                          color_discrete_sequence=['#1f77b4'])
        hour_fig.update_layout(template="plotly_white", height=300)

        # Create volume/flow distribution chart
        if primary_metric == 'NetFlow':
            hist_data = cluster_df['NetFlow']
            hist_title = "Net Flow Distribution"
            hist_label = "Net Flow"
        else:
            hist_data = cluster_df['Volume']
            hist_title = "Volume Distribution"
            hist_label = "Volume"

        dist_fig = px.histogram(hist_data,
                                title=hist_title,
                                labels={'value': hist_label, 'count': 'Frequency'},
                                color_discrete_sequence=['#2ca02c'])
        dist_fig.update_layout(template="plotly_white", height=300)

        # Find related transactions
        if not transactions_df.empty:
            # Get timestamps from this cluster
            cluster_times = pd.to_datetime(cluster_df.index)
            # Create time windows for matching
            time_windows = [(t - pd.Timedelta(hours=1), t + pd.Timedelta(hours=1)) for t in cluster_times]

            # Find transactions within these time windows
            pattern_txs = transactions_df[transactions_df[timestamp_col].apply(
                lambda x: any((start <= x <= end) for start, end in time_windows)
            )].copy()

            # If we have too many, sample them
            if len(pattern_txs) > 10:
                pattern_txs = pattern_txs.sample(10)

            # If we have too few, just sample from all transactions
            if len(pattern_txs) < 5 and len(transactions_df) >= 5:
                pattern_txs = transactions_df.sample(min(5, len(transactions_df)))
        else:
            pattern_txs = pd.DataFrame()

        # Comprehensive pattern dictionary
        pattern = {
            "name": behavior,
            "description": description,
            "strategy": strategy,
            "risk_profile": risk_profile,
            "time_insight": time_insight,
            "cluster_id": i,
            "metrics": pattern_metrics,
            "occurrence_count": len(cluster_df),
            "volume_metric": volume_metric,
            "confidence": confidence,
            "charts": {
                "main": main_fig,
                "hourly_distribution": hour_fig,
                "value_distribution": dist_fig
            },
            "examples": pattern_txs
        }

        patterns.append(pattern)

        return patterns

    def detect_anomalous_transactions(self,
                                      transactions_df: pd.DataFrame,
                                      sensitivity: str = "Medium") -> pd.DataFrame:
        """
        Detect anomalous transactions using statistical methods

        Args:
            transactions_df: DataFrame of transactions
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            DataFrame of anomalous transactions
        """
        if transactions_df.empty:
            return pd.DataFrame()

        # Ensure amount column exists
        if 'Amount' in transactions_df.columns:
            amount_col = 'Amount'
        elif 'tokenAmount' in transactions_df.columns:
            amount_col = 'tokenAmount'
        elif 'value' in transactions_df.columns:
            # Try to adjust for decimals if 'tokenDecimal' exists
            if 'tokenDecimal' in transactions_df.columns:
                transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                amount_col = 'adjustedValue'
            else:
                amount_col = 'value'
        else:
            raise ValueError("Amount column not found in transactions DataFrame")

        # Define sensitivity thresholds
        if sensitivity == "Low":
            z_threshold = 3.0  # Outliers beyond 3 standard deviations
        elif sensitivity == "Medium":
            z_threshold = 2.5  # Outliers beyond 2.5 standard deviations
        else:  # High
            z_threshold = 2.0  # Outliers beyond 2 standard deviations

        # Calculate z-score for amount
        mean_amount = transactions_df[amount_col].mean()
        std_amount = transactions_df[amount_col].std()

        if std_amount == 0:
            return pd.DataFrame()

        transactions_df['z_score'] = abs((transactions_df[amount_col] - mean_amount) / std_amount)

        # Flag anomalous transactions
        anomalies = transactions_df[transactions_df['z_score'] > z_threshold].copy()

        # Add risk level based on z-score
        anomalies['risk_level'] = 'Medium'
        anomalies.loc[anomalies['z_score'] > z_threshold * 1.5, 'risk_level'] = 'High'
        anomalies.loc[anomalies['z_score'] <= z_threshold * 1.2, 'risk_level'] = 'Low'

        return anomalies

|
678 |
+
    def analyze_price_impact(self,
                             transactions_df: pd.DataFrame,
                             price_data: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
        """
        Analyze the price impact of transactions with enhanced visualizations

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction

        Returns:
            Dictionary with comprehensive price impact analysis and visualizations
        """
        if transactions_df.empty or not price_data:
            # Create an empty chart for the default case
            empty_fig = go.Figure()
            empty_fig.update_layout(
                title="No Price Impact Data Available",
                xaxis_title="Time",
                yaxis_title="Price Impact (%)",
                height=400,
                template="plotly_white"
            )
            empty_fig.add_annotation(
                text="No transactions found with price impact data",
                showarrow=False,
                font=dict(size=14)
            )

            return {
                'avg_impact_pct': 0,
                'max_impact_pct': 0,
                'min_impact_pct': 0,
                'significant_moves_count': 0,
                'total_transactions': 0,
                'charts': {
                    'main_chart': empty_fig,
                    'impact_distribution': empty_fig,
                    'cumulative_impact': empty_fig,
                    'hourly_impact': empty_fig
                },
                'transactions_with_impact': pd.DataFrame(),
                'insights': [],
                'impact_summary': "No price impact data available"
            }

        # Ensure timestamp column is datetime
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
            # Convert timestamp to datetime if it's not already
            if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Combine price impact data with transactions
        impact_data = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            # Get token symbol if available
            token_symbol = row.get('tokenSymbol', 'Unknown')
            token_amount = row.get('value', 0)
            if 'tokenDecimal' in row:
                try:
                    token_amount = float(token_amount) / (10 ** int(row.get('tokenDecimal', 0)))
                except (ValueError, TypeError):
                    token_amount = 0

            impact_data.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct'],
                'token_symbol': token_symbol,
                'token_amount': token_amount,
                'from': row.get('from', ''),
                'to': row.get('to', ''),
                'hour': row[timestamp_col].hour if isinstance(row[timestamp_col], pd.Timestamp) else 0
            })

        if not impact_data:
            # Create an empty chart for the default case
            empty_fig = go.Figure()
            empty_fig.update_layout(
                title="No Price Impact Data Available",
                xaxis_title="Time",
                yaxis_title="Price Impact (%)",
                height=400,
                template="plotly_white"
            )
            empty_fig.add_annotation(
                text="No transactions found with price impact data",
                showarrow=False,
                font=dict(size=14)
            )

            return {
                'avg_impact_pct': 0,
                'max_impact_pct': 0,
                'min_impact_pct': 0,
                'significant_moves_count': 0,
                'total_transactions': len(transactions_df) if not transactions_df.empty else 0,
                'charts': {
                    'main_chart': empty_fig,
                    'impact_distribution': empty_fig,
                    'cumulative_impact': empty_fig,
                    'hourly_impact': empty_fig
                },
                'transactions_with_impact': pd.DataFrame(),
                'insights': [],
                'impact_summary': "No price impact data available"
            }

        impact_df = pd.DataFrame(impact_data)

        # Calculate aggregate metrics
        avg_impact = impact_df['impact_pct'].mean()
        max_impact = impact_df['impact_pct'].max()
        min_impact = impact_df['impact_pct'].min()
        median_impact = impact_df['impact_pct'].median()
        std_impact = impact_df['impact_pct'].std()

        # Count significant moves (>1% impact)
        significant_threshold = 1.0
        high_impact_threshold = 3.0
        significant_moves = len(impact_df[abs(impact_df['impact_pct']) > significant_threshold])
        high_impact_moves = len(impact_df[abs(impact_df['impact_pct']) > high_impact_threshold])
        positive_impacts = len(impact_df[impact_df['impact_pct'] > 0])
        negative_impacts = len(impact_df[impact_df['impact_pct'] < 0])

        # Calculate cumulative impact
        impact_df = impact_df.sort_values('timestamp')
        impact_df['cumulative_impact'] = impact_df['impact_pct'].cumsum()

        # Generate insights
        insights = []

        # Market direction bias
        if avg_impact > 0.5:
            insights.append({
                "title": "Positive Price Pressure",
                "description": f"Transactions show an overall positive price impact of {avg_impact:.2f}%, suggesting accumulation or market strength."
            })
        elif avg_impact < -0.5:
            insights.append({
                "title": "Negative Price Pressure",
                "description": f"Transactions show an overall negative price impact of {avg_impact:.2f}%, suggesting distribution or market weakness."
            })

        # Volatility analysis
        if std_impact > 2.0:
            insights.append({
                "title": "High Market Volatility",
                "description": f"Price impact shows high volatility (std: {std_impact:.2f}%), indicating potential market manipulation or whipsaw conditions."
            })

        # Significant impacts
        if high_impact_moves > 0:
            insights.append({
                "title": "High Impact Transactions",
                "description": f"Detected {high_impact_moves} high-impact transactions (>{high_impact_threshold}% price change), indicating potential market-moving activity."
            })

        # Temporal patterns
        hourly_impact = impact_df.groupby('hour')['impact_pct'].mean()
        if len(hourly_impact) > 0:
            max_hour = hourly_impact.abs().idxmax()
            max_hour_impact = hourly_impact[max_hour]
            insights.append({
                "title": "Time-Based Pattern",
                "description": f"Highest price impact occurs around {max_hour}:00 with an average of {max_hour_impact:.2f}%."
            })

        # Create impact summary text
        impact_summary = f"Analysis of {len(impact_df)} price-impacting transactions shows an average impact of {avg_impact:.2f}% "
        impact_summary += f"(range: {min_impact:.2f}% to {max_impact:.2f}%). "
        impact_summary += f"Found {significant_moves} significant price moves and {high_impact_moves} high-impact transactions. "
        if positive_impacts > negative_impacts:
            impact_summary += f"There is a bias towards positive price impact ({positive_impacts} positive vs {negative_impacts} negative)."
        elif negative_impacts > positive_impacts:
            impact_summary += f"There is a bias towards negative price impact ({negative_impacts} negative vs {positive_impacts} positive)."
        else:
            impact_summary += "The price impact is balanced between positive and negative moves."

        # Create enhanced main visualization
        main_fig = go.Figure()

        # Add scatter plot for impact
        main_fig.add_trace(go.Scatter(
            x=impact_df['timestamp'],
            y=impact_df['impact_pct'],
            mode='markers+lines',
            marker=dict(
                size=impact_df['impact_pct'].abs() * 1.5 + 5,
                color=impact_df['impact_pct'],
                colorscale='RdBu_r',
                line=dict(width=1),
                symbol=['circle' if val >= 0 else 'diamond' for val in impact_df['impact_pct']]
            ),
            text=[
                f"TX: {tx[:8]}...{tx[-6:]}<br>" +
                f"Impact: {impact:.2f}%<br>" +
                f"Token: {token} ({amount:.4f})<br>" +
                f"From: {src[:6]}...{src[-4:]}<br>" +
                f"To: {dst[:6]}...{dst[-4:]}"
                for tx, impact, token, amount, src, dst in zip(
                    impact_df['transaction_hash'],
                    impact_df['impact_pct'],
                    impact_df['token_symbol'],
                    impact_df['token_amount'],
                    impact_df['from'],
                    impact_df['to']
                )
            ],
            hovertemplate='%{text}<br>Time: %{x}<extra></extra>',
            name='Price Impact'
        ))

        # Add a moving average trendline
        window_size = max(3, len(impact_df) // 10)  # Dynamic window size
        if len(impact_df) >= window_size:
            impact_df['ma'] = impact_df['impact_pct'].rolling(window=window_size, min_periods=1).mean()
            main_fig.add_trace(go.Scatter(
                x=impact_df['timestamp'],
                y=impact_df['ma'],
                mode='lines',
                line=dict(width=2, color='rgba(255,165,0,0.7)'),
                name=f'Moving Avg ({window_size} period)'
            ))

        # Add a zero line for reference
        main_fig.add_shape(
            type='line',
            x0=impact_df['timestamp'].min(),
            y0=0,
            x1=impact_df['timestamp'].max(),
            y1=0,
            line=dict(color='gray', width=1, dash='dash')
        )

        # Add colored regions for significant impact

        # Add green band for normal price movement
        main_fig.add_shape(
            type='rect',
            x0=impact_df['timestamp'].min(),
            y0=-significant_threshold,
            x1=impact_df['timestamp'].max(),
            y1=significant_threshold,
            fillcolor='rgba(0,255,0,0.1)',
            line=dict(width=0),
            layer='below'
        )

        # Add warning bands for higher impact movements
        main_fig.add_shape(
            type='rect',
            x0=impact_df['timestamp'].min(),
            y0=significant_threshold,
            x1=impact_df['timestamp'].max(),
            y1=high_impact_threshold,
            fillcolor='rgba(255,255,0,0.1)',
            line=dict(width=0),
            layer='below'
        )

        main_fig.add_shape(
            type='rect',
            x0=impact_df['timestamp'].min(),
            y0=-high_impact_threshold,
            x1=impact_df['timestamp'].max(),
            y1=-significant_threshold,
            fillcolor='rgba(255,255,0,0.1)',
            line=dict(width=0),
            layer='below'
        )

        # Add high impact regions
        main_fig.add_shape(
            type='rect',
            x0=impact_df['timestamp'].min(),
            y0=high_impact_threshold,
            x1=impact_df['timestamp'].max(),
            y1=max(high_impact_threshold * 2, max_impact * 1.1),
            fillcolor='rgba(255,0,0,0.1)',
            line=dict(width=0),
            layer='below'
        )

        main_fig.add_shape(
            type='rect',
            x0=impact_df['timestamp'].min(),
            y0=min(high_impact_threshold * -2, min_impact * 1.1),
            x1=impact_df['timestamp'].max(),
            y1=-high_impact_threshold,
            fillcolor='rgba(255,0,0,0.1)',
            line=dict(width=0),
            layer='below'
        )

        main_fig.update_layout(
            title='Price Impact of Whale Transactions',
            xaxis_title='Timestamp',
            yaxis_title='Price Impact (%)',
            hovermode='closest',
            template="plotly_white",
            legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
            margin=dict(l=20, r=20, t=50, b=20)
        )

        # Create impact distribution histogram
        dist_fig = px.histogram(
            impact_df['impact_pct'],
            nbins=20,
            labels={'value': 'Price Impact (%)', 'count': 'Frequency'},
            title='Distribution of Price Impact',
            color_discrete_sequence=['#3366CC']
        )

        # Add a vertical line at the mean
        dist_fig.add_vline(x=avg_impact, line_dash="dash", line_color="red")
        dist_fig.add_annotation(x=avg_impact, y=0.85, yref="paper", text=f"Mean: {avg_impact:.2f}%",
                                showarrow=True, arrowhead=2, arrowcolor="red", ax=40)

        # Add a vertical line at zero
        dist_fig.add_vline(x=0, line_dash="solid", line_color="black")

        dist_fig.update_layout(
            template="plotly_white",
            bargap=0.1,
            height=350
        )

        # Create cumulative impact chart
        cumul_fig = go.Figure()
        cumul_fig.add_trace(go.Scatter(
            x=impact_df['timestamp'],
            y=impact_df['cumulative_impact'],
            mode='lines',
            fill='tozeroy',
            line=dict(width=2, color='#2ca02c'),
            name='Cumulative Impact'
        ))

        cumul_fig.update_layout(
            title='Cumulative Price Impact Over Time',
            xaxis_title='Timestamp',
            yaxis_title='Cumulative Price Impact (%)',
            template="plotly_white",
            height=350
        )

        # Create hourly impact analysis
        hourly_impact = impact_df.groupby('hour')['impact_pct'].agg(['mean', 'count', 'std']).reset_index()
        hourly_impact = hourly_impact.sort_values('hour')

        hour_fig = go.Figure()
        hour_fig.add_trace(go.Bar(
            x=hourly_impact['hour'],
            y=hourly_impact['mean'],
            error_y=dict(type='data', array=hourly_impact['std'], visible=True),
            marker_color=hourly_impact['mean'].apply(lambda x: 'green' if x > 0 else 'red'),
            name='Average Impact'
        ))

        hour_fig.update_layout(
            title='Price Impact by Hour of Day',
            xaxis_title='Hour of Day',
            yaxis_title='Average Price Impact (%)',
            template="plotly_white",
            height=350,
            xaxis=dict(tickmode='linear', tick0=0, dtick=2)
        )

        # Join with original transactions
        transactions_df = transactions_df.copy()
        transactions_df['Timestamp_key'] = transactions_df[timestamp_col]
        impact_df['Timestamp_key'] = impact_df['timestamp']

        merged_df = pd.merge(
            transactions_df,
            impact_df[['Timestamp_key', 'impact_pct', 'pre_price', 'post_price', 'cumulative_impact']],
            on='Timestamp_key',
            how='left'
        )

        # Final result with enhanced output
        return {
            'avg_impact_pct': avg_impact,
            'max_impact_pct': max_impact,
            'min_impact_pct': min_impact,
            'median_impact_pct': median_impact,
            'std_impact_pct': std_impact,
            'significant_moves_count': significant_moves,
            'high_impact_moves_count': high_impact_moves,
            'positive_impacts_count': positive_impacts,
            'negative_impacts_count': negative_impacts,
            'total_transactions': len(transactions_df),
            'charts': {
                'main_chart': main_fig,
                'impact_distribution': dist_fig,
                'cumulative_impact': cumul_fig,
                'hourly_impact': hour_fig
            },
            'transactions_with_impact': merged_df,
            'insights': insights,
            'impact_summary': impact_summary
        }

    def detect_wash_trading(self,
                            transactions_df: pd.DataFrame,
                            addresses: List[str],
                            time_window_minutes: int = 60,
                            sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential wash trading between addresses

        Args:
            transactions_df: DataFrame of transactions
            addresses: List of addresses to analyze
            time_window_minutes: Time window for detecting wash trades
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential wash trading incidents
        """
        if transactions_df.empty or not addresses:
            return []

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            min_cycles = 3  # Minimum number of back-and-forth transactions
            max_time_diff = 120  # Maximum minutes between transactions
        elif sensitivity == "Medium":
            min_cycles = 2
            max_time_diff = 60
        else:  # High
            min_cycles = 1
            max_time_diff = 30

        # Filter transactions involving the addresses
        address_txs = transactions_df[
            (transactions_df[from_col].isin(addresses)) |
            (transactions_df[to_col].isin(addresses))
        ].copy()

        if address_txs.empty:
            return []

        # Sort by timestamp
        address_txs = address_txs.sort_values(by=timestamp_col)

        # Detect cycles of transactions between same addresses
        wash_trades = []

        for addr1 in addresses:
            for addr2 in addresses:
                if addr1 == addr2:
                    continue

                # Find transactions from addr1 to addr2
                a1_to_a2 = address_txs[
                    (address_txs[from_col] == addr1) &
                    (address_txs[to_col] == addr2)
                ]

                # Find transactions from addr2 to addr1
                a2_to_a1 = address_txs[
                    (address_txs[from_col] == addr2) &
                    (address_txs[to_col] == addr1)
                ]

                if a1_to_a2.empty or a2_to_a1.empty:
                    continue

                # Check for back-and-forth patterns
                cycles = 0
                evidence = []

                for _, tx1 in a1_to_a2.iterrows():
                    tx1_time = tx1[timestamp_col]

                    # Find return transactions within the time window
                    return_txs = a2_to_a1[
                        (a2_to_a1[timestamp_col] > tx1_time) &
                        (a2_to_a1[timestamp_col] <= tx1_time + pd.Timedelta(minutes=max_time_diff))
                    ]

                    if not return_txs.empty:
                        cycles += 1
                        evidence.append(tx1)
                        evidence.append(return_txs.iloc[0])

                if cycles >= min_cycles:
                    # Create visualization
                    if evidence:
                        evidence_df = pd.DataFrame(evidence)
                        fig = px.scatter(
                            evidence_df,
                            x=timestamp_col,
                            y=evidence_df.get('Amount', evidence_df.get('tokenAmount', evidence_df.get('value', 0))),
                            color=from_col,
                            title=f"Potential Wash Trading Between {addr1[:8]}... and {addr2[:8]}..."
                        )
                    else:
                        fig = None

                    wash_trades.append({
                        "type": "Wash Trading",
                        "addresses": [addr1, addr2],
                        "risk_level": "High" if cycles >= min_cycles * 2 else "Medium",
                        "description": f"Detected {cycles} cycles of back-and-forth transactions between addresses",
                        "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                        "title": f"Wash Trading Pattern ({cycles} cycles)",
                        "evidence": pd.DataFrame(evidence) if evidence else None,
                        "chart": fig
                    })

        return wash_trades

    def detect_pump_and_dump(self,
                             transactions_df: pd.DataFrame,
                             price_data: Dict[str, Dict[str, Any]],
                             sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential pump and dump schemes

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential pump and dump incidents
        """
        if transactions_df.empty or not price_data:
            return []

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            accumulation_threshold = 5  # Number of buys to consider accumulation
            pump_threshold = 10.0  # % price increase to trigger pump
            dump_threshold = -8.0  # % price decrease to trigger dump
        elif sensitivity == "Medium":
            accumulation_threshold = 3
            pump_threshold = 7.0
            dump_threshold = -5.0
        else:  # High
            accumulation_threshold = 2
            pump_threshold = 5.0
            dump_threshold = -3.0

        # Combine price impact data with transactions
        txs_with_impact = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            txs_with_impact.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'from': row[from_col],
                'to': row[to_col],
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct']
            })

        if not txs_with_impact:
            return []

        impact_df = pd.DataFrame(txs_with_impact)
        impact_df = impact_df.sort_values(by='timestamp')

        # Look for accumulation phases followed by price pumps and then dumps
        pump_and_dumps = []

        # Group by address to analyze per wallet
        address_groups = {}

        for from_addr in impact_df['from'].unique():
            address_groups[from_addr] = impact_df[impact_df['from'] == from_addr]

        for to_addr in impact_df['to'].unique():
            if to_addr in address_groups:
                address_groups[to_addr] = pd.concat([
                    address_groups[to_addr],
                    impact_df[impact_df['to'] == to_addr]
                ])
            else:
                address_groups[to_addr] = impact_df[impact_df['to'] == to_addr]

        for address, addr_df in address_groups.items():
            # Skip if not enough transactions
            if len(addr_df) < accumulation_threshold + 2:
                continue

            # Look for continuous price increase followed by sharp drop
            window_size = min(len(addr_df), 10)
            for i in range(len(addr_df) - window_size + 1):
                window = addr_df.iloc[i:i+window_size]

                # Get cumulative price change in window
                if len(window) >= 2:
                    first_price = window.iloc[0]['pre_price']
                    last_price = window.iloc[-1]['post_price']

                    if first_price is None or last_price is None:
                        continue

                    cumulative_change = ((last_price - first_price) / first_price) * 100

                    # Check for pump phase
                    max_price = window['post_price'].max()
                    max_idx = window['post_price'].idxmax()

                    if max_idx < len(window) - 1:
                        max_to_end = ((window.iloc[-1]['post_price'] - max_price) / max_price) * 100

                        # If we have a pump followed by a dump
                        if (cumulative_change > pump_threshold or
                                any(window['impact_pct'] > pump_threshold)) and max_to_end < dump_threshold:

                            # Create chart
                            fig = go.Figure()

                            # Plot price line
                            times = [t.timestamp() for t in window['timestamp']]
                            prices = []
                            for _, row in window.iterrows():
                                prices.append(row['pre_price'])
                                prices.append(row['post_price'])

                            times_expanded = []
                            for t in times:
                                times_expanded.append(t - 60)  # 1 min before
                                times_expanded.append(t + 60)  # 1 min after

                            fig.add_trace(go.Scatter(
                                x=times_expanded,
                                y=prices,
                                mode='lines+markers',
                                name='Price',
                                line=dict(color='blue')
                            ))

                            # Highlight pump and dump phases
                            max_time_idx = window.index.get_loc(max_idx)
                            pump_x = times_expanded[:max_time_idx*2+2]
                            pump_y = prices[:max_time_idx*2+2]

                            dump_x = times_expanded[max_time_idx*2:]
                            dump_y = prices[max_time_idx*2:]

                            fig.add_trace(go.Scatter(
                                x=pump_x,
                                y=pump_y,
                                mode='lines',
                                line=dict(color='green', width=3),
                                name='Pump Phase'
                            ))

                            fig.add_trace(go.Scatter(
                                x=dump_x,
                                y=dump_y,
                                mode='lines',
                                line=dict(color='red', width=3),
                                name='Dump Phase'
                            ))

                            fig.update_layout(
                                title='Potential Pump and Dump Pattern',
                                xaxis_title='Time',
                                yaxis_title='Price',
                                hovermode='closest'
                            )

                            pump_and_dumps.append({
                                "type": "Pump and Dump",
                                "addresses": [address],
                                "risk_level": "High" if max_to_end < dump_threshold * 1.5 else "Medium",
                                "description": f"Price pumped {cumulative_change:.2f}% before dropping {max_to_end:.2f}%",
                                "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                                "title": f"Pump ({cumulative_change:.1f}%) and Dump ({max_to_end:.1f}%)",
                                "evidence": window,
                                "chart": fig
                            })

        return pump_and_dumps

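The anomaly, price-impact, and manipulation helpers above all consume a plain transactions DataFrame plus a `price_data` dictionary keyed by transaction hash. The sketch below is not part of the committed files; it only illustrates the expected input shapes with invented values. The `DataProcessor` class name and its no-argument constructor are assumptions taken from the `modules/tools.py` import shown later in this commit, and may need adjusting to the real constructor.

```python
# Hypothetical usage sketch (not in the commit): feed the z-score anomaly
# detector and the price-impact analysis above with toy data.
import pandas as pd
from modules.data_processor import DataProcessor  # assumed entry point

# Twelve ordinary transfers plus one outsized one for the z-score check
txs = pd.DataFrame({
    "hash": [f"0x{i:03x}" for i in range(13)],
    "timeStamp": [1714000000 + 600 * i for i in range(13)],
    "from": ["0xaaa"] * 13,
    "to": ["0xbbb"] * 13,
    "Amount": [1_000.0 + i for i in range(12)] + [50_000.0],
    "tokenSymbol": ["ARB"] * 13,
})

# price_data is keyed by transaction hash; each entry needs pre_price,
# post_price and impact_pct (values here are made up)
price_data = {
    "0x000": {"pre_price": 1.00, "post_price": 1.02, "impact_pct": 2.0},
    "0x00c": {"pre_price": 1.02, "post_price": 0.97, "impact_pct": -4.9},
}

processor = DataProcessor()  # assumed no-arg constructor

anomalies = processor.detect_anomalous_transactions(txs, sensitivity="High")
print(anomalies[["hash", "z_score", "risk_level"]])  # flags the 50,000 transfer

report = processor.analyze_price_impact(txs, price_data)
print(report["impact_summary"])
report["charts"]["main_chart"].show()  # Plotly figure
```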
modules/detection.py
ADDED
@@ -0,0 +1,684 @@
1 |
+
import pandas as pd
|
2 |
+
import numpy as np
|
3 |
+
from datetime import datetime, timedelta
|
4 |
+
from typing import Dict, List, Optional, Union, Any, Tuple
|
5 |
+
import plotly.graph_objects as go
|
6 |
+
import plotly.express as px
|
7 |
+
|
8 |
+
|
9 |
+
class ManipulationDetector:
|
10 |
+
"""
|
11 |
+
Detect potential market manipulation patterns in whale transactions
|
12 |
+
"""
|
13 |
+
|
14 |
+
def __init__(self):
|
15 |
+
# Define known manipulation patterns
|
16 |
+
self.patterns = {
|
17 |
+
"pump_and_dump": {
|
18 |
+
"description": "Rapid buys followed by coordinated sell-offs, causing price to first rise then crash",
|
19 |
+
"risk_factor": 0.8
|
20 |
+
},
|
21 |
+
"wash_trading": {
|
22 |
+
"description": "Self-trading across multiple addresses to create false impression of market activity",
|
23 |
+
"risk_factor": 0.9
|
24 |
+
},
|
25 |
+
"spoofing": {
|
26 |
+
"description": "Large orders placed then canceled before execution to manipulate price",
|
27 |
+
"risk_factor": 0.7
|
28 |
+
},
|
29 |
+
"layering": {
|
30 |
+
"description": "Multiple orders at different price levels to create false impression of market depth",
|
31 |
+
"risk_factor": 0.6
|
32 |
+
},
|
33 |
+
"momentum_ignition": {
|
34 |
+
"description": "Creating sharp price moves to trigger other participants' momentum-based trading",
|
35 |
+
"risk_factor": 0.5
|
36 |
+
}
|
37 |
+
}
|
38 |
+
|
39 |
+
def detect_wash_trading(self,
|
40 |
+
transactions_df: pd.DataFrame,
|
41 |
+
addresses: List[str],
|
42 |
+
sensitivity: str = "Medium",
|
43 |
+
lookback_hours: int = 24) -> List[Dict[str, Any]]:
|
44 |
+
"""
|
45 |
+
Detect potential wash trading between addresses
|
46 |
+
|
47 |
+
Args:
|
48 |
+
transactions_df: DataFrame of transactions
|
49 |
+
addresses: List of addresses to analyze
|
50 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
51 |
+
lookback_hours: Hours to look back for wash trading patterns
|
52 |
+
|
53 |
+
Returns:
|
54 |
+
List of potential wash trading alerts
|
55 |
+
"""
|
56 |
+
if transactions_df.empty or not addresses:
|
57 |
+
return []
|
58 |
+
|
59 |
+
# Ensure from/to columns exist
|
60 |
+
if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
|
61 |
+
from_col, to_col = 'From', 'To'
|
62 |
+
elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
|
63 |
+
from_col, to_col = 'from', 'to'
|
64 |
+
else:
|
65 |
+
raise ValueError("From/To columns not found in transactions DataFrame")
|
66 |
+
|
67 |
+
# Ensure timestamp column exists
|
68 |
+
if 'Timestamp' in transactions_df.columns:
|
69 |
+
timestamp_col = 'Timestamp'
|
70 |
+
elif 'timeStamp' in transactions_df.columns:
|
71 |
+
timestamp_col = 'timeStamp'
|
72 |
+
else:
|
73 |
+
raise ValueError("Timestamp column not found in transactions DataFrame")
|
74 |
+
|
75 |
+
# Ensure timestamp is datetime
|
76 |
+
if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
|
77 |
+
if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
|
78 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
|
79 |
+
else:
|
80 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])
|
81 |
+
|
82 |
+
# Define sensitivity thresholds
|
83 |
+
if sensitivity == "Low":
|
84 |
+
min_cycles = 3 # Minimum number of back-and-forth transactions
|
85 |
+
max_time_diff = 120 # Maximum minutes between transactions
|
86 |
+
elif sensitivity == "Medium":
|
87 |
+
min_cycles = 2
|
88 |
+
max_time_diff = 60
|
89 |
+
else: # High
|
90 |
+
min_cycles = 1
|
91 |
+
max_time_diff = 30
|
92 |
+
|
93 |
+
# Filter transactions by lookback period
|
94 |
+
lookback_time = datetime.now() - timedelta(hours=lookback_hours)
|
95 |
+
recent_txs = transactions_df[transactions_df[timestamp_col] >= lookback_time]
|
96 |
+
|
97 |
+
if recent_txs.empty:
|
98 |
+
return []
|
99 |
+
|
100 |
+
# Filter transactions involving the addresses
|
101 |
+
address_txs = recent_txs[
|
102 |
+
(recent_txs[from_col].isin(addresses)) |
|
103 |
+
(recent_txs[to_col].isin(addresses))
|
104 |
+
].copy()
|
105 |
+
|
106 |
+
if address_txs.empty:
|
107 |
+
return []
|
108 |
+
|
109 |
+
# Sort by timestamp
|
110 |
+
address_txs = address_txs.sort_values(by=timestamp_col)
|
111 |
+
|
112 |
+
# Detect cycles of transactions between same addresses
|
113 |
+
wash_trades = []
|
114 |
+
|
115 |
+
for addr1 in addresses:
|
116 |
+
for addr2 in addresses:
|
117 |
+
if addr1 == addr2:
|
118 |
+
continue
|
119 |
+
|
120 |
+
# Find transactions from addr1 to addr2
|
121 |
+
a1_to_a2 = address_txs[
|
122 |
+
(address_txs[from_col] == addr1) &
|
123 |
+
(address_txs[to_col] == addr2)
|
124 |
+
]
|
125 |
+
|
126 |
+
# Find transactions from addr2 to addr1
|
127 |
+
a2_to_a1 = address_txs[
|
128 |
+
(address_txs[from_col] == addr2) &
|
129 |
+
(address_txs[to_col] == addr1)
|
130 |
+
]
|
131 |
+
|
132 |
+
if a1_to_a2.empty or a2_to_a1.empty:
|
133 |
+
continue
|
134 |
+
|
135 |
+
# Check for back-and-forth patterns
|
136 |
+
cycles = 0
|
137 |
+
evidence = []
|
138 |
+
|
139 |
+
for _, tx1 in a1_to_a2.iterrows():
|
140 |
+
tx1_time = tx1[timestamp_col]
|
141 |
+
|
142 |
+
# Find return transactions within the time window
|
143 |
+
return_txs = a2_to_a1[
|
144 |
+
(a2_to_a1[timestamp_col] > tx1_time) &
|
145 |
+
(a2_to_a1[timestamp_col] <= tx1_time + pd.Timedelta(minutes=max_time_diff))
|
146 |
+
]
|
147 |
+
|
148 |
+
if not return_txs.empty:
|
149 |
+
cycles += 1
|
150 |
+
evidence.append(tx1)
|
151 |
+
evidence.append(return_txs.iloc[0])
|
152 |
+
|
153 |
+
if cycles >= min_cycles:
|
154 |
+
# Create visualization
|
155 |
+
if evidence:
|
156 |
+
evidence_df = pd.DataFrame(evidence)
|
157 |
+
|
158 |
+
# Get amount column
|
159 |
+
if 'Amount' in evidence_df.columns:
|
160 |
+
amount_col = 'Amount'
|
161 |
+
elif 'tokenAmount' in evidence_df.columns:
|
162 |
+
amount_col = 'tokenAmount'
|
163 |
+
elif 'value' in evidence_df.columns:
|
164 |
+
# Try to adjust for decimals if 'tokenDecimal' exists
|
165 |
+
if 'tokenDecimal' in evidence_df.columns:
|
166 |
+
evidence_df['adjustedValue'] = evidence_df['value'].astype(float) / (10 ** evidence_df['tokenDecimal'].astype(int))
|
167 |
+
amount_col = 'adjustedValue'
|
168 |
+
else:
|
169 |
+
amount_col = 'value'
|
170 |
+
else:
|
171 |
+
amount_col = None
|
172 |
+
|
173 |
+
# Create figure if amount column exists
|
174 |
+
if amount_col:
|
175 |
+
fig = px.scatter(
|
176 |
+
evidence_df,
|
177 |
+
x=timestamp_col,
|
178 |
+
y=amount_col,
|
179 |
+
color=from_col,
|
180 |
+
title=f"Potential Wash Trading Between {addr1[:8]}... and {addr2[:8]}..."
|
181 |
+
)
|
182 |
+
else:
|
183 |
+
fig = None
|
184 |
+
else:
|
185 |
+
fig = None
|
186 |
+
|
187 |
+
wash_trades.append({
|
188 |
+
"type": "Wash Trading",
|
189 |
+
"addresses": [addr1, addr2],
|
190 |
+
"risk_level": "High" if cycles >= min_cycles * 2 else "Medium",
|
191 |
+
"description": f"Detected {cycles} cycles of back-and-forth transactions between addresses",
|
192 |
+
"detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
|
193 |
+
"title": f"Wash Trading Pattern ({cycles} cycles)",
|
194 |
+
"evidence": pd.DataFrame(evidence) if evidence else None,
|
195 |
+
"chart": fig
|
196 |
+
})
|
197 |
+
|
198 |
+
return wash_trades
|
199 |
+
|
200 |
+
def detect_pump_and_dump(self,
|
201 |
+
transactions_df: pd.DataFrame,
|
202 |
+
price_data: Dict[str, Dict[str, Any]],
|
203 |
+
sensitivity: str = "Medium") -> List[Dict[str, Any]]:
|
204 |
+
"""
|
205 |
+
Detect potential pump and dump schemes
|
206 |
+
|
207 |
+
Args:
|
208 |
+
transactions_df: DataFrame of transactions
|
209 |
+
price_data: Dictionary of price impact data for each transaction
|
210 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
211 |
+
|
212 |
+
Returns:
|
213 |
+
List of potential pump and dump alerts
|
214 |
+
"""
|
215 |
+
if transactions_df.empty or not price_data:
|
216 |
+
return []
|
217 |
+
|
218 |
+
# Ensure timestamp column exists
|
219 |
+
if 'Timestamp' in transactions_df.columns:
|
220 |
+
timestamp_col = 'Timestamp'
|
221 |
+
elif 'timeStamp' in transactions_df.columns:
|
222 |
+
timestamp_col = 'timeStamp'
|
223 |
+
else:
|
224 |
+
raise ValueError("Timestamp column not found in transactions DataFrame")
|
225 |
+
|
226 |
+
# Ensure from/to columns exist
|
227 |
+
if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
|
228 |
+
from_col, to_col = 'From', 'To'
|
229 |
+
elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
|
230 |
+
from_col, to_col = 'from', 'to'
|
231 |
+
else:
|
232 |
+
raise ValueError("From/To columns not found in transactions DataFrame")
|
233 |
+
|
234 |
+
# Ensure timestamp is datetime
|
235 |
+
if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
|
236 |
+
if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
|
237 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
|
238 |
+
else:
|
239 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])
|
240 |
+
|
241 |
+
# Define sensitivity thresholds
|
242 |
+
if sensitivity == "Low":
|
243 |
+
accumulation_threshold = 5 # Number of buys to consider accumulation
|
244 |
+
pump_threshold = 10.0 # % price increase to trigger pump
|
245 |
+
dump_threshold = -8.0 # % price decrease to trigger dump
|
246 |
+
elif sensitivity == "Medium":
|
247 |
+
accumulation_threshold = 3
|
248 |
+
pump_threshold = 7.0
|
249 |
+
dump_threshold = -5.0
|
250 |
+
else: # High
|
251 |
+
accumulation_threshold = 2
|
252 |
+
pump_threshold = 5.0
|
253 |
+
dump_threshold = -3.0
|
254 |
+
|
255 |
+
# Combine price impact data with transactions
|
256 |
+
txs_with_impact = []
|
257 |
+
|
258 |
+
for idx, row in transactions_df.iterrows():
|
259 |
+
tx_hash = row.get('Transaction Hash', row.get('hash', None))
|
260 |
+
if not tx_hash or tx_hash not in price_data:
|
261 |
+
continue
|
262 |
+
|
263 |
+
tx_impact = price_data[tx_hash]
|
264 |
+
|
265 |
+
if tx_impact['impact_pct'] is None:
|
266 |
+
continue
|
267 |
+
|
268 |
+
txs_with_impact.append({
|
269 |
+
'transaction_hash': tx_hash,
|
270 |
+
'timestamp': row[timestamp_col],
|
271 |
+
'from': row[from_col],
|
272 |
+
'to': row[to_col],
|
273 |
+
'pre_price': tx_impact['pre_price'],
|
274 |
+
'post_price': tx_impact['post_price'],
|
275 |
+
'impact_pct': tx_impact['impact_pct']
|
276 |
+
})
|
277 |
+
|
278 |
+
if not txs_with_impact:
|
279 |
+
return []
|
280 |
+
|
281 |
+
impact_df = pd.DataFrame(txs_with_impact)
|
282 |
+
impact_df = impact_df.sort_values(by='timestamp')
|
283 |
+
|
284 |
+
# Look for accumulation phases followed by price pumps and then dumps
|
285 |
+
pump_and_dumps = []
|
286 |
+
|
287 |
+
# Group by address to analyze per wallet
|
288 |
+
address_groups = {}
|
289 |
+
|
290 |
+
for from_addr in impact_df['from'].unique():
|
291 |
+
address_groups[from_addr] = impact_df[impact_df['from'] == from_addr]
|
292 |
+
|
293 |
+
for to_addr in impact_df['to'].unique():
|
294 |
+
if to_addr in address_groups:
|
295 |
+
address_groups[to_addr] = pd.concat([
|
296 |
+
address_groups[to_addr],
|
297 |
+
impact_df[impact_df['to'] == to_addr]
|
298 |
+
])
|
299 |
+
else:
|
300 |
+
address_groups[to_addr] = impact_df[impact_df['to'] == to_addr]
|
301 |
+
|
302 |
+
for address, addr_df in address_groups.items():
|
303 |
+
# Skip if not enough transactions
|
304 |
+
if len(addr_df) < accumulation_threshold + 2:
|
305 |
+
continue
|
306 |
+
|
307 |
+
# Look for continuous price increase followed by sharp drop
|
308 |
+
window_size = min(len(addr_df), 10)
|
309 |
+
for i in range(len(addr_df) - window_size + 1):
|
310 |
+
window = addr_df.iloc[i:i+window_size]
|
311 |
+
|
312 |
+
# Get cumulative price change in window
|
313 |
+
if len(window) >= 2:
|
314 |
+
first_price = window.iloc[0]['pre_price']
|
315 |
+
last_price = window.iloc[-1]['post_price']
|
316 |
+
|
317 |
+
if first_price is None or last_price is None:
|
318 |
+
continue
|
319 |
+
|
320 |
+
cumulative_change = ((last_price - first_price) / first_price) * 100
|
321 |
+
|
322 |
+
# Check for pump phase
|
323 |
+
max_price = window['post_price'].max()
|
324 |
+
max_idx = window['post_price'].idxmax()
|
325 |
+
|
326 |
+
if max_idx < len(window) - 1:
|
327 |
+
max_to_end = ((window.iloc[-1]['post_price'] - max_price) / max_price) * 100
|
328 |
+
|
329 |
+
# If we have a pump followed by a dump
|
330 |
+
if (cumulative_change > pump_threshold or
|
331 |
+
any(window['impact_pct'] > pump_threshold)) and max_to_end < dump_threshold:
|
332 |
+
|
333 |
+
# Create chart
|
334 |
+
fig = go.Figure()
|
335 |
+
|
336 |
+
# Plot price line
|
337 |
+
times = [t.timestamp() for t in window['timestamp']]
|
338 |
+
prices = []
|
339 |
+
for _, row in window.iterrows():
|
340 |
+
prices.append(row['pre_price'])
|
341 |
+
prices.append(row['post_price'])
|
342 |
+
|
343 |
+
times_expanded = []
|
344 |
+
for t in times:
|
345 |
+
times_expanded.append(t - 60) # 1 min before
|
346 |
+
times_expanded.append(t + 60) # 1 min after
|
347 |
+
|
348 |
+
fig.add_trace(go.Scatter(
|
349 |
+
x=times_expanded,
|
350 |
+
y=prices,
|
351 |
+
mode='lines+markers',
|
352 |
+
name='Price',
|
353 |
+
line=dict(color='blue')
|
354 |
+
))
|
355 |
+
|
356 |
+
# Highlight pump and dump phases
|
357 |
+
max_time_idx = window.index.get_loc(max_idx)
|
358 |
+
pump_x = times_expanded[:max_time_idx*2+2]
|
359 |
+
pump_y = prices[:max_time_idx*2+2]
|
360 |
+
|
361 |
+
dump_x = times_expanded[max_time_idx*2:]
|
362 |
+
dump_y = prices[max_time_idx*2:]
|
363 |
+
|
364 |
+
fig.add_trace(go.Scatter(
|
365 |
+
x=pump_x,
|
366 |
+
y=pump_y,
|
367 |
+
mode='lines',
|
368 |
+
line=dict(color='green', width=3),
|
369 |
+
name='Pump Phase'
|
370 |
+
))
|
371 |
+
|
372 |
+
fig.add_trace(go.Scatter(
|
373 |
+
x=dump_x,
|
374 |
+
y=dump_y,
|
375 |
+
mode='lines',
|
376 |
+
line=dict(color='red', width=3),
|
377 |
+
name='Dump Phase'
|
378 |
+
))
|
379 |
+
|
380 |
+
fig.update_layout(
|
381 |
+
title='Potential Pump and Dump Pattern',
|
382 |
+
xaxis_title='Time',
|
383 |
+
yaxis_title='Price',
|
384 |
+
hovermode='closest'
|
385 |
+
)
|
386 |
+
|
387 |
+
pump_and_dumps.append({
|
388 |
+
"type": "Pump and Dump",
|
389 |
+
"addresses": [address],
|
390 |
+
"risk_level": "High" if max_to_end < dump_threshold * 1.5 else "Medium",
|
391 |
+
"description": f"Price pumped {cumulative_change:.2f}% before dropping {max_to_end:.2f}%",
|
392 |
+
"detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
|
393 |
+
"title": f"Pump ({cumulative_change:.1f}%) and Dump ({max_to_end:.1f}%)",
|
394 |
+
"evidence": window,
|
395 |
+
"chart": fig
|
396 |
+
})
|
397 |
+
|
398 |
+
return pump_and_dumps
|
399 |
+
|
400 |
+
def detect_spoofing(self,
|
401 |
+
transactions_df: pd.DataFrame,
|
402 |
+
order_book_data: Optional[pd.DataFrame] = None,
|
403 |
+
sensitivity: str = "Medium") -> List[Dict[str, Any]]:
|
404 |
+
"""
|
405 |
+
Detect potential spoofing (placing and quickly canceling large orders)
|
406 |
+
|
407 |
+
Args:
|
408 |
+
transactions_df: DataFrame of transactions
|
409 |
+
order_book_data: Optional DataFrame of order book data
|
410 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
411 |
+
|
412 |
+
Returns:
|
413 |
+
List of potential spoofing alerts
|
414 |
+
"""
|
415 |
+
# Note: This is a placeholder since we don't have direct order book data
|
416 |
+
# In a real implementation, this would analyze order placement and cancellations
|
417 |
+
|
418 |
+
# For now, return an empty list as we can't detect spoofing without order book data
|
419 |
+
return []
|
420 |
+
|
421 |
+
def detect_layering(self,
|
422 |
+
transactions_df: pd.DataFrame,
|
423 |
+
order_book_data: Optional[pd.DataFrame] = None,
|
424 |
+
sensitivity: str = "Medium") -> List[Dict[str, Any]]:
|
425 |
+
"""
|
426 |
+
Detect potential layering (placing multiple orders at different price levels)
|
427 |
+
|
428 |
+
Args:
|
429 |
+
transactions_df: DataFrame of transactions
|
430 |
+
order_book_data: Optional DataFrame of order book data
|
431 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
432 |
+
|
433 |
+
Returns:
|
434 |
+
List of potential layering alerts
|
435 |
+
"""
|
436 |
+
# Note: This is a placeholder since we don't have direct order book data
|
437 |
+
# In a real implementation, this would analyze order book depth and patterns
|
438 |
+
|
439 |
+
# For now, return an empty list as we can't detect layering without order book data
|
440 |
+
return []
|
441 |
+
|
442 |
+
def detect_momentum_ignition(self,
|
443 |
+
transactions_df: pd.DataFrame,
|
444 |
+
price_data: Dict[str, Dict[str, Any]],
|
445 |
+
sensitivity: str = "Medium") -> List[Dict[str, Any]]:
|
446 |
+
"""
|
447 |
+
Detect potential momentum ignition (creating sharp price moves)
|
448 |
+
|
449 |
+
Args:
|
450 |
+
transactions_df: DataFrame of transactions
|
451 |
+
price_data: Dictionary of price impact data for each transaction
|
452 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
453 |
+
|
454 |
+
Returns:
|
455 |
+
List of potential momentum ignition alerts
|
456 |
+
"""
|
457 |
+
if transactions_df.empty or not price_data:
|
458 |
+
return []
|
459 |
+
|
460 |
+
# Ensure timestamp column exists
|
461 |
+
if 'Timestamp' in transactions_df.columns:
|
462 |
+
timestamp_col = 'Timestamp'
|
463 |
+
elif 'timeStamp' in transactions_df.columns:
|
464 |
+
timestamp_col = 'timeStamp'
|
465 |
+
else:
|
466 |
+
raise ValueError("Timestamp column not found in transactions DataFrame")
|
467 |
+
|
468 |
+
# Ensure timestamp is datetime
|
469 |
+
if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
|
470 |
+
if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
|
471 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
|
472 |
+
else:
|
473 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])
|
474 |
+
|
475 |
+
# Define sensitivity thresholds
|
476 |
+
if sensitivity == "Low":
|
477 |
+
impact_threshold = 15.0 # % price impact to trigger alert
|
478 |
+
time_window_minutes = 5 # Time window to look for follow-up transactions
|
479 |
+
elif sensitivity == "Medium":
|
480 |
+
impact_threshold = 10.0
|
481 |
+
time_window_minutes = 10
|
482 |
+
else: # High
|
483 |
+
impact_threshold = 5.0
|
484 |
+
time_window_minutes = 15
|
485 |
+
|
486 |
+
# Combine price impact data with transactions
|
487 |
+
txs_with_impact = []
|
488 |
+
|
489 |
+
for idx, row in transactions_df.iterrows():
|
490 |
+
tx_hash = row.get('Transaction Hash', row.get('hash', None))
|
491 |
+
if not tx_hash or tx_hash not in price_data:
|
492 |
+
continue
|
493 |
+
|
494 |
+
tx_impact = price_data[tx_hash]
|
495 |
+
|
496 |
+
if tx_impact['impact_pct'] is None:
|
497 |
+
continue
|
498 |
+
|
499 |
+
txs_with_impact.append({
|
500 |
+
'transaction_hash': tx_hash,
|
501 |
+
'timestamp': row[timestamp_col],
|
502 |
+
'from': row.get('From', row.get('from', 'Unknown')),
|
503 |
+
'to': row.get('To', row.get('to', 'Unknown')),
|
504 |
+
'pre_price': tx_impact['pre_price'],
|
505 |
+
'post_price': tx_impact['post_price'],
|
506 |
+
'impact_pct': tx_impact['impact_pct']
|
507 |
+
})
|
508 |
+
|
509 |
+
if not txs_with_impact:
|
510 |
+
return []
|
511 |
+
|
512 |
+
impact_df = pd.DataFrame(txs_with_impact)
|
513 |
+
impact_df = impact_df.sort_values(by='timestamp')
|
514 |
+
|
515 |
+
# Look for large price impacts followed by increased trading activity
|
516 |
+
momentum_alerts = []
|
517 |
+
|
518 |
+
# Find high-impact transactions
|
519 |
+
high_impact_txs = impact_df[abs(impact_df['impact_pct']) > impact_threshold]
|
520 |
+
|
521 |
+
for idx, high_impact_tx in high_impact_txs.iterrows():
|
522 |
+
tx_time = high_impact_tx['timestamp']
|
523 |
+
|
524 |
+
# Look for increased trading activity after the high-impact transaction
|
525 |
+
follow_up_window = impact_df[
|
526 |
+
(impact_df['timestamp'] > tx_time) &
|
527 |
+
(impact_df['timestamp'] <= tx_time + pd.Timedelta(minutes=time_window_minutes))
|
528 |
+
]
|
529 |
+
|
530 |
+
# Compare activity to baseline (same time window before the transaction)
|
531 |
+
baseline_window = impact_df[
|
532 |
+
(impact_df['timestamp'] < tx_time) &
|
533 |
+
(impact_df['timestamp'] >= tx_time - pd.Timedelta(minutes=time_window_minutes))
|
534 |
+
]
|
535 |
+
|
536 |
+
if len(follow_up_window) > len(baseline_window) * 1.5 and len(follow_up_window) >= 3:
|
537 |
+
# Create chart
|
538 |
+
fig = go.Figure()
|
539 |
+
|
540 |
+
# Plot price timeline
|
541 |
+
all_relevant_txs = pd.concat([
|
542 |
+
pd.DataFrame([high_impact_tx]),
|
543 |
+
follow_up_window,
|
544 |
+
baseline_window
|
545 |
+
]).sort_values(by='timestamp')
|
546 |
+
|
547 |
+
# Create time series for price
|
548 |
+
timestamps = all_relevant_txs['timestamp']
|
549 |
+
prices = []
|
550 |
+
for _, row in all_relevant_txs.iterrows():
|
551 |
+
prices.append(row['pre_price'])
|
552 |
+
prices.append(row['post_price'])
|
553 |
+
|
554 |
+
times_expanded = []
|
555 |
+
for t in timestamps:
|
556 |
+
times_expanded.append(t - pd.Timedelta(seconds=30))
|
557 |
+
times_expanded.append(t + pd.Timedelta(seconds=30))
|
558 |
+
|
559 |
+
# Plot price line
|
560 |
+
fig.add_trace(go.Scatter(
|
561 |
+
x=times_expanded[:len(prices)], # In case of any length mismatch
|
562 |
+
y=prices[:len(times_expanded)],
|
563 |
+
mode='lines',
|
564 |
+
name='Price'
|
565 |
+
))
|
566 |
+
|
567 |
+
```python
            # Highlight the high-impact transaction
            fig.add_trace(go.Scatter(
                x=[high_impact_tx['timestamp']],
                y=[high_impact_tx['post_price']],
                mode='markers',
                marker=dict(
                    size=15,
                    color='red',
                    symbol='circle'
                ),
                name='Momentum Ignition'
            ))

            # Highlight the follow-up transactions
            if not follow_up_window.empty:
                fig.add_trace(go.Scatter(
                    x=follow_up_window['timestamp'],
                    y=follow_up_window['post_price'],
                    mode='markers',
                    marker=dict(
                        size=10,
                        color='orange',
                        symbol='circle'
                    ),
                    name='Follow-up Activity'
                ))

            fig.update_layout(
                title='Potential Momentum Ignition Pattern',
                xaxis_title='Time',
                yaxis_title='Price',
                hovermode='closest'
            )

            momentum_alerts.append({
                "type": "Momentum Ignition",
                "addresses": [high_impact_tx['from']],
                "risk_level": "High" if abs(high_impact_tx['impact_pct']) > impact_threshold * 1.5 else "Medium",
                "description": f"Large {high_impact_tx['impact_pct']:.2f}% price move followed by {len(follow_up_window)} transactions in {time_window_minutes} minutes (vs {len(baseline_window)} in baseline)",
                "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                "title": f"Momentum Ignition ({high_impact_tx['impact_pct']:.1f}% price move)",
                "evidence": pd.concat([pd.DataFrame([high_impact_tx]), follow_up_window]),
                "chart": fig
            })

        return momentum_alerts

    def run_all_detections(self,
                           transactions_df: pd.DataFrame,
                           addresses: List[str],
                           price_data: Dict[str, Dict[str, Any]] = None,
                           order_book_data: Optional[pd.DataFrame] = None,
                           sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Run all manipulation detection algorithms

        Args:
            transactions_df: DataFrame of transactions
            addresses: List of addresses to analyze
            price_data: Optional dictionary of price impact data for each transaction
            order_book_data: Optional DataFrame of order book data
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential manipulation alerts
        """
        if transactions_df.empty:
            return []

        all_alerts = []

        # Detect wash trading
        wash_trading_alerts = self.detect_wash_trading(
            transactions_df=transactions_df,
            addresses=addresses,
            sensitivity=sensitivity
        )
        all_alerts.extend(wash_trading_alerts)

        # Detect pump and dump (if price data available)
        if price_data:
            pump_and_dump_alerts = self.detect_pump_and_dump(
                transactions_df=transactions_df,
                price_data=price_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(pump_and_dump_alerts)

            # Detect momentum ignition (if price data available)
            momentum_alerts = self.detect_momentum_ignition(
                transactions_df=transactions_df,
                price_data=price_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(momentum_alerts)

        # Detect spoofing (if order book data available)
        if order_book_data is not None:
            spoofing_alerts = self.detect_spoofing(
                transactions_df=transactions_df,
                order_book_data=order_book_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(spoofing_alerts)

            # Detect layering (if order book data available)
            layering_alerts = self.detect_layering(
                transactions_df=transactions_df,
                order_book_data=order_book_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(layering_alerts)

        # Sort alerts by risk level
        risk_order = {"High": 0, "Medium": 1, "Low": 2}
        all_alerts.sort(key=lambda x: risk_order.get(x.get("risk_level", "Low"), 3))

        return all_alerts
```
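For context, here is a minimal sketch of how this entry point might be driven. The class name `ManipulationDetector` is an assumption (the class definition sits earlier in `modules/detection.py` and is not shown in this hunk), and the DataFrame columns are illustrative only.

```python
# Sketch only: `ManipulationDetector` and the column layout are assumptions.
import pandas as pd
from modules.detection import ManipulationDetector  # assumed class name

detector = ManipulationDetector()
transactions_df = pd.DataFrame([
    {"hash": "0xabc", "from": "0x1111", "to": "0x2222",
     "value": "5000000000000000000", "timeStamp": 1700000000, "tokenSymbol": "ARB"},
])

alerts = detector.run_all_detections(
    transactions_df=transactions_df,
    addresses=["0x1111", "0x2222"],
    price_data=None,        # without price data, pump-and-dump / momentum checks are skipped
    order_book_data=None,   # without order book data, spoofing / layering checks are skipped
    sensitivity="Medium",
)
for alert in alerts:
    print(alert["risk_level"], alert["title"])
```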
modules/tools.py
ADDED
@@ -0,0 +1,373 @@
```python
import json
import pandas as pd
from datetime import datetime
from typing import Dict, List, Optional, Union, Any, Tuple

from langchain.tools import tool
from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor

# Tools for Arbiscan API
class ArbiscanTools:
    def __init__(self, arbiscan_client: ArbiscanClient):
        self.client = arbiscan_client

    @tool("get_token_transfers")
    def get_token_transfers(self, address: str, contract_address: Optional[str] = None) -> str:
        """
        Get ERC-20 token transfers for a specific address

        Args:
            address: Wallet address
            contract_address: Optional token contract address to filter by

        Returns:
            List of token transfers as JSON string
        """
        transfers = self.client.get_token_transfers(
            address=address,
            contract_address=contract_address
        )
        return json.dumps(transfers)

    @tool("get_token_balance")
    def get_token_balance(self, address: str, contract_address: str) -> str:
        """
        Get the current balance of a specific token for an address

        Args:
            address: Wallet address
            contract_address: Token contract address

        Returns:
            Token balance
        """
        balance = self.client.get_token_balance(
            address=address,
            contract_address=contract_address
        )
        return balance

    @tool("get_normal_transactions")
    def get_normal_transactions(self, address: str) -> str:
        """
        Get normal transactions (ETH/ARB transfers) for a specific address

        Args:
            address: Wallet address

        Returns:
            List of normal transactions as JSON string
        """
        transactions = self.client.get_normal_transactions(address=address)
        return json.dumps(transactions)

    @tool("get_internal_transactions")
    def get_internal_transactions(self, address: str) -> str:
        """
        Get internal transactions for a specific address

        Args:
            address: Wallet address

        Returns:
            List of internal transactions as JSON string
        """
        transactions = self.client.get_internal_transactions(address=address)
        return json.dumps(transactions)

    @tool("fetch_whale_transactions")
    def fetch_whale_transactions(self,
                                 addresses: List[str],
                                 token_address: Optional[str] = None,
                                 min_token_amount: Optional[float] = None,
                                 min_usd_value: Optional[float] = None) -> str:
        """
        Fetch whale transactions for a list of addresses

        Args:
            addresses: List of wallet addresses
            token_address: Optional token contract address to filter by
            min_token_amount: Minimum token amount
            min_usd_value: Minimum USD value

        Returns:
            DataFrame of whale transactions as JSON string
        """
        transactions_df = self.client.fetch_whale_transactions(
            addresses=addresses,
            token_address=token_address,
            min_token_amount=min_token_amount,
            min_usd_value=min_usd_value
        )
        return transactions_df.to_json(orient="records")


# Tools for Gemini API
class GeminiTools:
    def __init__(self, gemini_client: GeminiClient):
        self.client = gemini_client

    @tool("get_current_price")
    def get_current_price(self, symbol: str) -> str:
        """
        Get the current price of a token

        Args:
            symbol: Token symbol (e.g., "ETHUSD")

        Returns:
            Current price
        """
        price = self.client.get_current_price(symbol=symbol)
        return str(price) if price is not None else "Price not found"

    @tool("get_historical_prices")
    def get_historical_prices(self,
                              symbol: str,
                              start_time: str,
                              end_time: str) -> str:
        """
        Get historical prices for a token within a time range

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            start_time: Start datetime in ISO format
            end_time: End datetime in ISO format

        Returns:
            DataFrame of historical prices as JSON string
        """
        # Parse datetime strings
        start_time_dt = datetime.fromisoformat(start_time.replace('Z', '+00:00'))
        end_time_dt = datetime.fromisoformat(end_time.replace('Z', '+00:00'))

        prices_df = self.client.get_historical_prices(
            symbol=symbol,
            start_time=start_time_dt,
            end_time=end_time_dt
        )

        if prices_df is not None:
            return prices_df.to_json(orient="records")
        else:
            return "[]"

    @tool("get_price_impact")
    def get_price_impact(self,
                         symbol: str,
                         transaction_time: str,
                         lookback_minutes: int = 5,
                         lookahead_minutes: int = 5) -> str:
        """
        Analyze the price impact before and after a transaction

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            transaction_time: Transaction datetime in ISO format
            lookback_minutes: Minutes to look back before the transaction
            lookahead_minutes: Minutes to look ahead after the transaction

        Returns:
            Price impact data as JSON string
        """
        # Parse datetime string
        transaction_time_dt = datetime.fromisoformat(transaction_time.replace('Z', '+00:00'))

        impact_data = self.client.get_price_impact(
            symbol=symbol,
            transaction_time=transaction_time_dt,
            lookback_minutes=lookback_minutes,
            lookahead_minutes=lookahead_minutes
        )

        # Convert to JSON string
        result = {
            "pre_price": impact_data["pre_price"],
            "post_price": impact_data["post_price"],
            "impact_pct": impact_data["impact_pct"]
        }
        return json.dumps(result)


# Tools for Data Processor
class DataProcessorTools:
    def __init__(self, data_processor: DataProcessor):
        self.processor = data_processor

    @tool("aggregate_transactions")
    def aggregate_transactions(self,
                               transactions_json: str,
                               time_window: str = 'D') -> str:
        """
        Aggregate transactions by time window

        Args:
            transactions_json: JSON string of transactions
            time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)

        Returns:
            Aggregated DataFrame as JSON string
        """
        # Convert JSON to DataFrame
        transactions_df = pd.read_json(transactions_json)

        # Process data
        agg_df = self.processor.aggregate_transactions(
            transactions_df=transactions_df,
            time_window=time_window
        )

        # Convert result to JSON
        return agg_df.to_json(orient="records")

    @tool("identify_patterns")
    def identify_patterns(self,
                          transactions_json: str,
                          n_clusters: int = 3) -> str:
        """
        Identify trading patterns using clustering

        Args:
            transactions_json: JSON string of transactions
            n_clusters: Number of clusters for K-Means

        Returns:
            List of pattern dictionaries as JSON string
        """
        # Convert JSON to DataFrame
        transactions_df = pd.read_json(transactions_json)

        # Process data
        patterns = self.processor.identify_patterns(
            transactions_df=transactions_df,
            n_clusters=n_clusters
        )

        # Convert result to JSON
        result = []
        for pattern in patterns:
            # Convert non-serializable objects to serializable format
            pattern_json = {
                "name": pattern["name"],
                "description": pattern["description"],
                "cluster_id": pattern["cluster_id"],
                "occurrence_count": pattern["occurrence_count"],
                "confidence": pattern["confidence"],
                # Skip chart_data as it's not JSON serializable
                "examples": pattern["examples"].to_json(orient="records") if isinstance(pattern["examples"], pd.DataFrame) else []
            }
            result.append(pattern_json)

        return json.dumps(result)

    @tool("detect_anomalous_transactions")
    def detect_anomalous_transactions(self,
                                      transactions_json: str,
                                      sensitivity: str = "Medium") -> str:
        """
        Detect anomalous transactions using statistical methods

        Args:
            transactions_json: JSON string of transactions
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            DataFrame of anomalous transactions as JSON string
        """
        # Convert JSON to DataFrame
        transactions_df = pd.read_json(transactions_json)

        # Process data
        anomalies_df = self.processor.detect_anomalous_transactions(
            transactions_df=transactions_df,
            sensitivity=sensitivity
        )

        # Convert result to JSON
        return anomalies_df.to_json(orient="records")

    @tool("analyze_price_impact")
    def analyze_price_impact(self,
                             transactions_json: str,
                             price_data_json: str) -> str:
        """
        Analyze the price impact of transactions

        Args:
            transactions_json: JSON string of transactions
            price_data_json: JSON string of price impact data

        Returns:
            Price impact analysis as JSON string
        """
        # Convert JSON to DataFrame
        transactions_df = pd.read_json(transactions_json)

        # Convert price_data_json to dictionary
        price_data = json.loads(price_data_json)

        # Process data
        impact_analysis = self.processor.analyze_price_impact(
            transactions_df=transactions_df,
            price_data=price_data
        )

        # Convert result to JSON (excluding non-serializable objects)
        result = {
            "avg_impact_pct": impact_analysis.get("avg_impact_pct"),
            "max_impact_pct": impact_analysis.get("max_impact_pct"),
            "min_impact_pct": impact_analysis.get("min_impact_pct"),
            "significant_moves_count": impact_analysis.get("significant_moves_count"),
            "total_transactions": impact_analysis.get("total_transactions"),
            # Skip impact_chart as it's not JSON serializable
            "transactions_with_impact": impact_analysis.get("transactions_with_impact").to_json(orient="records") if "transactions_with_impact" in impact_analysis else []
        }

        return json.dumps(result)

    @tool("detect_wash_trading")
    def detect_wash_trading(self,
                            transactions_json: str,
                            addresses_json: str,
                            sensitivity: str = "Medium") -> str:
        """
        Detect potential wash trading between addresses

        Args:
            transactions_json: JSON string of transactions
            addresses_json: JSON string of addresses to analyze
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential wash trading incidents as JSON string
        """
        # Convert JSON to DataFrame
        transactions_df = pd.read_json(transactions_json)

        # Convert addresses_json to list
        addresses = json.loads(addresses_json)

        # Process data
        wash_trades = self.processor.detect_wash_trading(
            transactions_df=transactions_df,
            addresses=addresses,
            sensitivity=sensitivity
        )

        # Convert result to JSON (excluding non-serializable objects)
        result = []
        for trade in wash_trades:
            trade_json = {
                "type": trade["type"],
                "addresses": trade["addresses"],
                "risk_level": trade["risk_level"],
                "description": trade["description"],
                "detection_time": trade["detection_time"],
                "title": trade["title"],
                "evidence": trade["evidence"].to_json(orient="records") if isinstance(trade["evidence"], pd.DataFrame) else []
                # Skip chart as it's not JSON serializable
            }
            result.append(trade_json)

        return json.dumps(result)
```
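A hypothetical wiring of these wrappers is sketched below. The `ArbiscanClient`/`GeminiClient` constructor arguments and the no-argument `DataProcessor()` call are assumptions inferred from the `.env` variable names, not taken from `modules/api_client.py` or `modules/data_processor.py`.

```python
# Sketch only: client constructor signatures below are assumptions.
import os
from dotenv import load_dotenv
from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from modules.tools import ArbiscanTools, GeminiTools, DataProcessorTools

load_dotenv()
arbiscan_tools = ArbiscanTools(ArbiscanClient(api_key=os.getenv("ARBISCAN_API_KEY")))  # assumed signature
gemini_tools = GeminiTools(GeminiClient(api_key=os.getenv("GEMINI_API_KEY")))          # assumed signature
processor_tools = DataProcessorTools(DataProcessor())                                  # assumed no-arg constructor

# The decorated methods return JSON strings, so they can be handed to
# CrewAI/LangChain agents (see modules/crew_system.py) as a tool list:
whale_tools = [
    arbiscan_tools.fetch_whale_transactions,
    gemini_tools.get_price_impact,
    processor_tools.detect_wash_trading,
]
```

One caveat: because `@tool` wraps these functions at class-definition time, the `self` parameter can end up in the generated tool schema; depending on the LangChain version, the wrappers may need to be built from bound methods or module-level functions instead.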
modules/visualizer.py
ADDED
@@ -0,0 +1,638 @@
```python
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Union, Any, Tuple
import io
import base64
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet


class Visualizer:
    """
    Generate visualizations and reports for whale transaction data
    """

    def __init__(self):
        self.color_map = {
            "buy": "green",
            "sell": "red",
            "transfer": "blue",
            "other": "gray"
        }

    def create_transaction_timeline(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a timeline visualization of transactions

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        if transactions_df.empty:
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Data Available",
                xaxis_title="Date",
                yaxis_title="Action",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transaction data available for timeline",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Ensure timestamp column exists
            if 'Timestamp' in transactions_df.columns:
                timestamp_col = 'Timestamp'
            elif 'timeStamp' in transactions_df.columns:
                timestamp_col = 'timeStamp'
                # Convert timestamp to datetime if it's not already
                if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
                    try:
                        transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col].astype(float), unit='s')
                    except Exception as e:
                        print(f"Error converting timestamp: {str(e)}")
                        transactions_df[timestamp_col] = pd.date_range(start='2025-01-01', periods=len(transactions_df), freq='H')
            else:
                # Create a dummy timestamp if none exists
                transactions_df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(transactions_df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Create figure
            fig = go.Figure()

            # Add transactions to timeline
            for idx, row in transactions_df.iterrows():
                # Determine transaction type
                if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
                    from_col, to_col = 'From', 'To'
                else:
                    from_col, to_col = 'from', 'to'

                tx_type = "other"
                hover_text = ""

                if pd.isna(row[from_col]) or row[from_col] == '0x0000000000000000000000000000000000000000':
                    tx_type = "buy"
                    hover_text = f"Buy: {row[to_col]}"
                elif pd.isna(row[to_col]) or row[to_col] == '0x0000000000000000000000000000000000000000':
                    tx_type = "sell"
                    hover_text = f"Sell: {row[from_col]}"
                else:
                    tx_type = "transfer"
                    hover_text = f"Transfer: {row[from_col]} → {row[to_col]}"

                # Add amount to hover text if available
                if 'Amount' in row:
                    hover_text += f"<br>Amount: {row['Amount']}"
                elif 'value' in row:
                    hover_text += f"<br>Value: {row['value']}"

                # Add token info if available
                if 'tokenSymbol' in row:
                    hover_text += f"<br>Token: {row['tokenSymbol']}"

                # Add transaction to timeline
                fig.add_trace(go.Scatter(
                    x=[row[timestamp_col]],
                    y=[tx_type],
                    mode='markers',
                    marker=dict(
                        size=12,
                        color=self.color_map.get(tx_type, "gray"),
                        line=dict(width=1, color='black')
                    ),
                    name=tx_type,
                    text=hover_text,
                    hoverinfo='text'
                ))

            # Update layout
            fig.update_layout(
                title='Whale Transaction Timeline',
                xaxis_title='Time',
                yaxis_title='Transaction Type',
                height=400,
                template='plotly_white',
                showlegend=True,
                hovermode='closest'
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error creating transaction timeline: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Transaction Timeline",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating timeline: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def create_volume_chart(self, transactions_df: pd.DataFrame, time_window: str = 'D') -> go.Figure:
        """
        Create a volume chart aggregated by time window

        Args:
            transactions_df: DataFrame of transactions
            time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)

        Returns:
            Plotly figure object
        """
        # Create an empty figure with appropriate message if no data
        if transactions_df.empty:
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Data Available",
                xaxis_title="Date",
                yaxis_title="Volume",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transactions found for volume analysis",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Create a deep copy to avoid modifying the original
            df = transactions_df.copy()

            # Ensure timestamp column exists and convert to datetime
            if 'Timestamp' in df.columns:
                timestamp_col = 'Timestamp'
            elif 'timeStamp' in df.columns:
                timestamp_col = 'timeStamp'
            else:
                # Create a dummy timestamp if none exists
                df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Convert timestamp to datetime safely
            if not pd.api.types.is_datetime64_any_dtype(df[timestamp_col]):
                try:
                    df[timestamp_col] = pd.to_datetime(df[timestamp_col].astype(float), unit='s')
                except Exception as e:
                    print(f"Error converting timestamp: {str(e)}")
                    df[timestamp_col] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')

            # Ensure amount column exists
            if 'Amount' in df.columns:
                amount_col = 'Amount'
            elif 'tokenAmount' in df.columns:
                amount_col = 'tokenAmount'
            elif 'value' in df.columns:
                # Try to adjust for decimals if 'tokenDecimal' exists
                if 'tokenDecimal' in df.columns:
                    df['adjustedValue'] = df['value'].astype(float) / (10 ** df['tokenDecimal'].astype(int))
                    amount_col = 'adjustedValue'
                else:
                    amount_col = 'value'
            else:
                # Create a dummy amount column if none exists
                df['dummy_amount'] = 1.0
                amount_col = 'dummy_amount'

            # Alternative approach: manually aggregate by date to avoid index issues
            df['date'] = df[timestamp_col].dt.date

            # Group by date
            volume_data = df.groupby('date').agg({
                amount_col: 'sum',
                timestamp_col: 'count'
            }).reset_index()

            volume_data.columns = ['Date', 'Volume', 'Count']

            # Create figure
            fig = go.Figure()

            # Add volume bars
            fig.add_trace(go.Bar(
                x=volume_data['Date'],
                y=volume_data['Volume'],
                name='Volume',
                marker_color='blue',
                opacity=0.7
            ))

            # Add transaction count line
            fig.add_trace(go.Scatter(
                x=volume_data['Date'],
                y=volume_data['Count'],
                name='Transaction Count',
                mode='lines+markers',
                marker=dict(color='red'),
                yaxis='y2'
            ))

            # Update layout
            fig.update_layout(
                title="Transaction Volume Over Time",
                xaxis_title="Date",
                yaxis_title="Volume",
                yaxis2=dict(
                    title="Transaction Count",
                    overlaying="y",
                    side="right"
                ),
                height=500,
                template="plotly_white",
                hovermode="x unified",
                legend=dict(
                    orientation="h",
                    yanchor="bottom",
                    y=1.02,
                    xanchor="right",
                    x=1
                )
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error in create_volume_chart: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Volume Chart",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating volume chart: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def plot_volume_by_day(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a volume chart aggregated by day with improved visualization

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        # This is a wrapper around create_volume_chart that specifically uses day as the time window
        return self.create_volume_chart(transactions_df, time_window='D')

    def plot_transaction_flow(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a network flow visualization of transactions between wallets

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        if transactions_df.empty:
            # Return empty figure if no data
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Flow Data Available",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transactions found for flow analysis",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Ensure from/to columns exist
            if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
                from_col, to_col = 'From', 'To'
            elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
                from_col, to_col = 'from', 'to'
            else:
                # Create an error visualization
                fig = go.Figure()
                fig.update_layout(
                    title="Transaction Flow Error",
                    xaxis_title="",
                    yaxis_title="",
                    height=400,
                    template="plotly_white"
                )
                fig.add_annotation(
                    text="From/To columns not found in transactions data",
                    showarrow=False,
                    font=dict(size=14, color="red")
                )
                return fig

            # Ensure amount column exists
            if 'Amount' in transactions_df.columns:
                amount_col = 'Amount'
            elif 'tokenAmount' in transactions_df.columns:
                amount_col = 'tokenAmount'
            elif 'value' in transactions_df.columns:
                # Try to adjust for decimals if 'tokenDecimal' exists
                if 'tokenDecimal' in transactions_df.columns:
                    transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                    amount_col = 'adjustedValue'
                else:
                    amount_col = 'value'
            else:
                # Create an error visualization
                fig = go.Figure()
                fig.update_layout(
                    title="Transaction Flow Error",
                    xaxis_title="",
                    yaxis_title="",
                    height=400,
                    template="plotly_white"
                )
                fig.add_annotation(
                    text="Amount column not found in transactions data",
                    showarrow=False,
                    font=dict(size=14, color="red")
                )
                return fig

            # Aggregate flows between wallets
            flow_df = transactions_df.groupby([from_col, to_col]).agg({
                amount_col: ['sum', 'count']
            }).reset_index()

            flow_df.columns = [from_col, to_col, 'Value', 'Count']

            # Limit to top 20 flows to keep visualization readable
            top_flows = flow_df.sort_values('Value', ascending=False).head(20)

            # Create Sankey diagram
            # First, create a mapping of unique addresses to indices
            all_addresses = pd.unique(top_flows[[from_col, to_col]].values.ravel('K'))
            address_to_idx = {addr: i for i, addr in enumerate(all_addresses)}

            # Create source, target, and value arrays for the Sankey diagram
            sources = [address_to_idx[addr] for addr in top_flows[from_col]]
            targets = [address_to_idx[addr] for addr in top_flows[to_col]]
            values = top_flows['Value'].tolist()

            # Create hover text
            hover_text = [f"From: {src}<br>To: {tgt}<br>Value: {val:.2f}<br>Count: {cnt}"
                          for src, tgt, val, cnt in zip(top_flows[from_col], top_flows[to_col],
                                                        top_flows['Value'], top_flows['Count'])]

            # Shorten addresses for node labels
            node_labels = [f"{addr[:6]}...{addr[-4:]}" if len(addr) > 12 else addr
                           for addr in all_addresses]

            # Create Sankey diagram figure
            fig = go.Figure(data=[go.Sankey(
                node=dict(
                    pad=15,
                    thickness=20,
                    line=dict(color="black", width=0.5),
                    label=node_labels,
                    color="blue"
                ),
                link=dict(
                    source=sources,
                    target=targets,
                    value=values,
                    label=hover_text,
                    hovertemplate='%{label}<extra></extra>'
                )
            )])

            fig.update_layout(
                title="Whale Transaction Flow",
                font_size=12,
                height=600,
                template="plotly_white"
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error in plot_transaction_flow: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Transaction Flow",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating transaction flow: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def generate_pdf_report(self,
                            transactions_df: pd.DataFrame,
                            patterns: List[Dict[str, Any]] = None,
                            price_impact: Dict[str, Any] = None,
                            alerts: List[Dict[str, Any]] = None,
                            title: str = "Whale Analysis Report",
                            start_date: datetime = None,
                            end_date: datetime = None) -> bytes:
        """
        Generate a PDF report of whale activity

        Args:
            transactions_df: DataFrame of transactions
            patterns: List of pattern dictionaries
            price_impact: Dictionary of price impact analysis
            alerts: List of alert dictionaries
            title: Report title
            start_date: Start date for report period
            end_date: End date for report period

        Returns:
            PDF report as bytes
        """
        buffer = io.BytesIO()
        doc = SimpleDocTemplate(buffer, pagesize=letter)
        elements = []

        # Add title
        styles = getSampleStyleSheet()
        elements.append(Paragraph(title, styles['Title']))

        # Add date range
        if start_date and end_date:
            date_range = f"Period: {start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}"
            elements.append(Paragraph(date_range, styles['Heading2']))

        elements.append(Spacer(1, 12))

        # Add transaction summary
        if not transactions_df.empty:
            elements.append(Paragraph("Transaction Summary", styles['Heading2']))
            summary_data = [
                ["Total Transactions", str(len(transactions_df))],
                ["Unique Addresses", str(len(pd.unique(transactions_df['from'].tolist() + transactions_df['to'].tolist())))]
            ]

            # Add token breakdown if available
            if 'tokenSymbol' in transactions_df.columns:
                token_counts = transactions_df['tokenSymbol'].value_counts()
                summary_data.append(["Most Common Token", f"{token_counts.index[0]} ({token_counts.iloc[0]} txns)"])

            summary_table = Table(summary_data)
            summary_table.setStyle(TableStyle([
                ('BACKGROUND', (0, 0), (0, -1), colors.lightgrey),
                ('GRID', (0, 0), (-1, -1), 1, colors.black),
                ('PADDING', (0, 0), (-1, -1), 6),
            ]))
            elements.append(summary_table)
            elements.append(Spacer(1, 12))

        # Add pattern analysis
        if patterns:
            elements.append(Paragraph("Trading Patterns Detected", styles['Heading2']))
            for i, pattern in enumerate(patterns):
                pattern_text = f"Pattern {i+1}: {pattern.get('name', 'Unnamed')}\n"
                pattern_text += f"Description: {pattern.get('description', 'No description')}\n"
                if 'risk_profile' in pattern:
                    pattern_text += f"Risk Profile: {pattern['risk_profile']}\n"
                if 'confidence' in pattern:
                    pattern_text += f"Confidence: {pattern['confidence']:.2f}\n"

                elements.append(Paragraph(pattern_text, styles['Normal']))
                elements.append(Spacer(1, 6))

            elements.append(Spacer(1, 12))

        # Add price impact analysis
        if price_impact:
            elements.append(Paragraph("Price Impact Analysis", styles['Heading2']))
            impact_text = ""
            if 'avg_impact' in price_impact:
                impact_text += f"Average Impact: {price_impact['avg_impact']:.2f}%\n"
            if 'max_impact' in price_impact:
                impact_text += f"Maximum Impact: {price_impact['max_impact']:.2f}%\n"
            if 'insights' in price_impact:
                impact_text += f"Insights: {price_impact['insights']}\n"

            elements.append(Paragraph(impact_text, styles['Normal']))
            elements.append(Spacer(1, 12))

        # Add alerts
        if alerts:
            elements.append(Paragraph("Alerts", styles['Heading2']))
            for alert in alerts:
                alert_text = f"{alert.get('level', 'Info')}: {alert.get('message', 'No details')}"
                elements.append(Paragraph(alert_text, styles['Normal']))
                elements.append(Spacer(1, 6))

        # Build the PDF
        doc.build(elements)
        buffer.seek(0)
        return buffer.getvalue()

    def generate_csv_report(self,
                            transactions_df: pd.DataFrame,
                            report_type: str = "Transaction Summary") -> str:
        """
        Generate a CSV report of transaction data

        Args:
            transactions_df: DataFrame of transactions
            report_type: Type of report to generate

        Returns:
            CSV data as string
        """
        if transactions_df.empty:
            return "No data available for report"

        if report_type == "Transaction Summary":
            # Return basic transaction summary
            return transactions_df.to_csv(index=False)
        elif report_type == "Daily Volume":
            # Get timestamp column
            if 'Timestamp' in transactions_df.columns:
                timestamp_col = 'Timestamp'
            elif 'timeStamp' in transactions_df.columns:
                timestamp_col = 'timeStamp'
                # Convert timestamp to datetime if needed
                if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
                    try:
                        transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col].astype(float), unit='s')
                    except:
                        return "Error processing timestamp data"
            else:
                return "Timestamp column not found"

            # Get amount column
            if 'Amount' in transactions_df.columns:
                amount_col = 'Amount'
            elif 'tokenAmount' in transactions_df.columns:
                amount_col = 'tokenAmount'
            elif 'value' in transactions_df.columns:
                amount_col = 'value'
            else:
                return "Amount column not found"

            # Aggregate by day
            transactions_df['date'] = transactions_df[timestamp_col].dt.date
            daily_volume = transactions_df.groupby('date').agg({
                amount_col: 'sum',
                'hash': 'count'  # Assuming 'hash' exists for all transactions
            }).reset_index()

            daily_volume.columns = ['Date', 'Volume', 'Transactions']
            return daily_volume.to_csv(index=False)
        else:
            return "Unknown report type"

    def generate_png_chart(self,
                           fig: go.Figure,
                           width: int = 1200,
                           height: int = 800) -> bytes:
        """
        Convert a Plotly figure to PNG image data

        Args:
            fig: Plotly figure object
            width: Image width in pixels
            height: Image height in pixels

        Returns:
            PNG image as bytes
        """
        img_bytes = fig.to_image(format="png", width=width, height=height)
        return img_bytes
```
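A small usage sketch for the `Visualizer` follows; the column names mirror the fallbacks the methods above look for (`from`/`to`, `value`, `tokenDecimal`, `tokenSymbol`, `timeStamp`), and the sample data is made up for illustration.

```python
# Sketch only: the sample rows are illustrative, not real Arbitrum transfers.
import pandas as pd
from modules.visualizer import Visualizer

df = pd.DataFrame([
    {"hash": "0xaaa", "from": "0x1111", "to": "0x2222",
     "value": "2500000000000000000000", "tokenDecimal": "18",
     "tokenSymbol": "ARB", "timeStamp": "1700000000"},
    {"hash": "0xbbb", "from": "0x2222", "to": "0x3333",
     "value": "900000000000000000000", "tokenDecimal": "18",
     "tokenSymbol": "ARB", "timeStamp": "1700003600"},
])

viz = Visualizer()
timeline_fig = viz.create_transaction_timeline(df)  # one marker per transfer
volume_fig = viz.plot_volume_by_day(df)             # bars + count line, daily buckets
flow_fig = viz.plot_transaction_flow(df)            # Sankey of wallet-to-wallet flows
timeline_fig.show()  # or st.plotly_chart(timeline_fig) inside the Streamlit app
```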
requirements.txt
ADDED
@@ -0,0 +1,12 @@
```text
streamlit==1.30.0
pandas==2.1.1
numpy==1.26.0
matplotlib==3.8.0
plotly==5.18.0
python-dotenv==1.0.0
requests==2.31.0
scikit-learn==1.3.1
crewai>=0.28.0
langchain>=0.1.0,<0.2.0
reportlab==4.0.5
weasyprint==60.1
```
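One dependency note: `Visualizer.generate_png_chart` relies on `fig.to_image()`, which Plotly 5.x renders through the `kaleido` package. Since `kaleido` is not pinned here, PNG export will likely fail at runtime until it is added (for example `kaleido==0.2.1`) or installed separately with `pip install kaleido`.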
test_api.py
ADDED
@@ -0,0 +1,205 @@
```python
import os
import sys
import json
import urllib.request
import urllib.parse
import urllib.error
from urllib.error import URLError, HTTPError

# Simple dotenv implementation since the module may not be available
def load_dotenv():
    try:
        with open('.env', 'r') as file:
            for line in file:
                line = line.strip()
                if not line or line.startswith('#') or '=' not in line:
                    continue
                key, value = line.split('=', 1)
                os.environ[key] = value
    except Exception as e:
        print(f"Error loading .env file: {e}")
        return False
    return True

# Load environment variables
load_dotenv()

# Get API key from .env
ARBISCAN_API_KEY = os.getenv("ARBISCAN_API_KEY")
if not ARBISCAN_API_KEY:
    print("ERROR: ARBISCAN_API_KEY not found in .env file")
    sys.exit(1)

print(f"Using Arbiscan API Key: {ARBISCAN_API_KEY[:5]}...")

# Test addresses (known active ones)
TEST_ADDRESSES = [
    "0x5d8908afee1df9f7f0830105f8be828f97ce9e68",  # Arbitrum Treasury
    "0x2b1ad6184a6b0fac06bd225ed37c2abc04415ff4",  # Large holder
    "0xc47ff7f9efb3ef39c33a2c492a1372418d399ec2",  # Active trader
]

# User-provided addresses (from command line arguments)
if len(sys.argv) > 1:
    USER_ADDRESSES = sys.argv[1:]
    TEST_ADDRESSES.extend(USER_ADDRESSES)
    print(f"Added user-provided addresses: {USER_ADDRESSES}")

def test_api_key():
    """Test if the API key is valid"""
    base_url = "https://api.arbiscan.io/api"
    params = {
        "module": "stats",
        "action": "ethsupply",
        "apikey": ARBISCAN_API_KEY
    }

    try:
        print("\n===== TESTING API KEY =====")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params)
        url = f"{base_url}?{query_string}"
        print(f"Making request to: {url}")

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

            print(f"Response status code: {response.status}")
            print(f"Response JSON status: {data.get('status')}")
            print(f"Response message: {data.get('message', 'No message')}")

            if data.get("status") == "1":
                print("✅ API KEY IS VALID")
                return True
            else:
                print("❌ API KEY IS INVALID OR HAS ISSUES")
                if "API Key" in data.get("message", ""):
                    print(f"Error message: {data.get('message')}")
                print("→ You need to register for an API key at https://arbiscan.io/myapikey")
                return False

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
        return False
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
        return False
    except Exception as e:
        print(f"❌ Error testing API key: {str(e)}")
        return False

def test_address(address):
    """Test if an address has transactions on Arbitrum"""
    base_url = "https://api.arbiscan.io/api"

    # Test for token transfers
    params_token = {
        "module": "account",
        "action": "tokentx",
        "address": address,
        "startblock": "0",
        "endblock": "99999999",
        "page": "1",
        "offset": "10",  # Just get 10 for testing
        "sort": "desc",
        "apikey": ARBISCAN_API_KEY
    }

    # Test for normal transactions
    params_normal = {
        "module": "account",
        "action": "txlist",
        "address": address,
        "startblock": "0",
        "endblock": "99999999",
        "page": "1",
        "offset": "10",  # Just get 10 for testing
        "sort": "desc",
        "apikey": ARBISCAN_API_KEY
    }

    print(f"\n===== TESTING ADDRESS: {address} =====")

    # Check token transfers
    try:
        print("Testing token transfers...")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params_token)
        url = f"{base_url}?{query_string}"

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

            if data.get("status") == "1":
                transfers = data.get("result", [])
                print(f"✅ Found {len(transfers)} token transfers")
                if transfers:
                    print(f"First transfer: {json.dumps(transfers[0], indent=2)[:200]}...")
            else:
                print(f"❌ No token transfers found: {data.get('message', 'Unknown error')}")

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
    except Exception as e:
        print(f"❌ Error testing token transfers: {str(e)}")

    # Check normal transactions
    try:
        print("\nTesting normal transactions...")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params_normal)
        url = f"{base_url}?{query_string}"

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

            if data.get("status") == "1":
                transactions = data.get("result", [])
                print(f"✅ Found {len(transactions)} normal transactions")
                if transactions:
                    print(f"First transaction: {json.dumps(transactions[0], indent=2)[:200]}...")
            else:
                print(f"❌ No normal transactions found: {data.get('message', 'Unknown error')}")

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
    except Exception as e:
        print(f"❌ Error testing normal transactions: {str(e)}")

def main():
    """Main function to run tests"""
    print("=================================================")
    print("Arbitrum API Diagnostic Tool")
    print("=================================================")

    # Test the API key first
    api_valid = test_api_key()

    if not api_valid:
        print("\n⚠️ Please update your API key in the .env file")
        print("Register for an API key at https://arbiscan.io/myapikey")
        return

    # Test each address
    for address in TEST_ADDRESSES:
        test_address(address)

    print("\n=================================================")
    print("RECOMMENDATIONS:")
    print("1. If your API key is invalid, update it in the .env file")
    print("2. If test addresses work but yours don't, your addresses might not have activity on Arbitrum")
    print("3. Use one of the working test addresses in your app for testing")
    print("=================================================")

if __name__ == "__main__":
    main()
```
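To run the diagnostic against the bundled test addresses, use `python test_api.py`; any extra command-line arguments are treated as additional wallet addresses to check, for example `python test_api.py 0xYourWalletAddress`.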