⚡ Power Grid Optimization with LSTM + PPO

This repository showcases a hybrid deep learning + reinforcement learning system for power grid optimization in Lauderdale County, AL. The system forecasts demand using a weather-informed LSTM model and trains a PPO-based agent to maintain stability and minimize blackout risk under stress.

📈 Models

LSTM Demand Predictor
A deep bidirectional LSTM with attention, trained on 4 years of TVA and weather data.
PPO Grid Policy
Trained in a custom PowerGridEnv with generator output, transformer tap, and load shedding control.

🧠 Dataset Overview

Demand Data:
Sourced from the U.S. EIA (TVA region, 2021–2024)
- Demand, Net Generation, Day-Ahead Forecasts, Interchange
Weather Data:
Daily min/max temperatures + precipitation
- From 5 major TVA-region airports via NOAA

🧮 LSTM Model

Architecture:
2-layer bidirectional LSTM + attention, followed by global pooling and dense layers.
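The attention step can be illustrated without any deep learning framework. The README does not say which attention variant is used, so this sketch assumes simple dot-product scoring against a learned query vector; `attention_pool`, `hidden_states`, and `query` are hypothetical names, and in the real model the hidden states would come from the bidirectional LSTM.

```python
import math


def attention_pool(hidden_states, query):
    """Collapse a sequence of hidden vectors into one context vector.

    hidden_states: list of T vectors (each a list of D floats).
    query: list of D floats (a learned parameter in a real model).
    Returns (context, weights); the weights sum to 1 over the T time steps.
    """
    # Score each time step by its dot product with the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]

    # Numerically stable softmax over time.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden vectors.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights
```

With a zero query every time step gets the same weight, so the result reduces to plain average pooling; a trained query lets the model emphasize the days that matter most for the forecast.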
Key Features:
- Rolling temperature windows, demand lags
- Weekly mean demand, change rate
- Temp volatility, extreme flags
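The features listed above could be derived roughly as follows. This is a dependency-free sketch, not the repository's pipeline: the 7-day window, the temperature thresholds, the Celsius units, and every name (`build_features`, `HOT_C`, `COLD_C`, the dictionary keys) are assumptions made for illustration.

```python
from statistics import mean, pstdev

WINDOW = 7      # assumed rolling window length, in days
HOT_C = 35.0    # assumed threshold for an "extreme heat" day, in °C
COLD_C = -5.0   # assumed threshold for an "extreme cold" day, in °C


def build_features(demand, tmax, tmin):
    """Return one feature dict per day, starting once a full window exists.

    demand: daily demand in MWh; tmax/tmin: daily max/min temperature in °C.
    """
    if not (len(demand) == len(tmax) == len(tmin)):
        raise ValueError("demand, tmax and tmin must have the same length")

    rows = []
    for t in range(WINDOW, len(demand)):
        # Only use demand strictly before day t, so the target never leaks in.
        past_demand = demand[t - WINDOW:t]
        past_tmax = tmax[t - WINDOW:t]
        rows.append({
            "day": t,
            "demand_lag_1": demand[t - 1],
            "demand_lag_7": demand[t - WINDOW],
            "weekly_mean_demand": mean(past_demand),
            "demand_change_rate": (demand[t - 1] - demand[t - 2]) / demand[t - 2],
            "tmax_rolling_mean": mean(past_tmax),
            "temp_volatility": pstdev(past_tmax),
            # Day-t weather is assumed to be known (for example from a forecast).
            "extreme_flag": int(tmax[t] >= HOT_C or tmin[t] <= COLD_C),
        })
    return rows
```

In practice the same features are usually built with a dataframe library using shifted rolling windows; the plain-list version is used here only to make each definition explicit.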
Metrics:

Metric	Value
R²	0.911
RMSE	19,565 MWh
Mean Error	713 MWh (over-forecast bias)
Beats TVA Forecast	70.08% of days
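For reference, the four metrics in the table can be computed as in the sketch below. The sign convention (mean error = prediction minus actual, so a positive value means over-forecasting) and the definition of "beats" (strictly smaller absolute error than the TVA day-ahead forecast on that day) are assumptions; `forecast_metrics` is a hypothetical helper, not code from this repository.

```python
import math


def forecast_metrics(actual, predicted, baseline):
    """Compute R², RMSE, mean error and the share of days beating a baseline.

    actual, predicted, baseline: equal-length lists of daily values in MWh.
    baseline stands in for the TVA day-ahead forecast.
    """
    n = len(actual)
    errors = [p - a for p, a in zip(predicted, actual)]

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total sum of squares

    # A day counts as a win when the model's absolute error is strictly smaller.
    wins = sum(
        abs(p - a) < abs(b - a) for a, p, b in zip(actual, predicted, baseline)
    )

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mean_error": sum(errors) / n,
        "beat_rate": wins / n,
    }
```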

🤖 PPO DRL Agent

Environment:
PyPSA-based Lauderdale County grid
- 6 generators (Nuclear, Hydro, CCGT)
- Load centers with realistic demand shares
- Thermal constraints, ramp limits, marginal costs
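To show how capacity limits, ramp limits, and marginal costs interact, here is a simplified merit-order dispatch for a single time step. It is not PyPSA and ignores the network entirely (no line flows or thermal limits); the function name `dispatch`, the dictionary keys, and any numbers passed in are hypothetical.

```python
def dispatch(demand_mw, gens, prev_output):
    """Greedy merit-order dispatch under capacity and ramp limits.

    gens: name -> {"p_min", "p_max", "ramp", "cost"} (MW, MW, MW per step, $/MWh).
    prev_output: name -> output in MW at the previous step.
    Returns (setpoints, unmet_mw). Surplus generation at the lower bounds
    is not reported by this sketch.
    """
    # Feasible band this step: capacity limits intersected with ramp limits.
    lower = {g: max(s["p_min"], prev_output[g] - s["ramp"]) for g, s in gens.items()}
    upper = {g: min(s["p_max"], prev_output[g] + s["ramp"]) for g, s in gens.items()}

    setpoints = dict(lower)  # every unit runs at least at its lower bound
    remaining = demand_mw - sum(setpoints.values())

    # Cover the remaining demand from the cheapest unit upward.
    for g in sorted(gens, key=lambda name: gens[name]["cost"]):
        if remaining <= 0:
            break
        extra = min(upper[g] - lower[g], remaining)
        setpoints[g] += extra
        remaining -= extra

    return setpoints, max(0.0, remaining)
```

Any `unmet_mw` left over is demand that the fleet cannot cover within its ramp limits, which is the kind of situation where an agent would have to fall back on load shedding.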
Action Space:
- Generator control
- Transformer tap shift
- Load shedding (up to 20%)
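One common way to expose controls like these to PPO is a normalized action vector that the environment decodes into physical set-points. The sketch below assumes a layout of six generator entries, one tap entry, and one shedding entry, each in [-1, 1]; that layout, the tap range, and the name `decode_action` are assumptions. Only the 20% shedding cap and the generator count come from this README.

```python
N_GEN = 6        # number of generators, from the environment description
MAX_SHED = 0.20  # load shedding is capped at 20%


def decode_action(action, p_min, p_max, tap_steps=2):
    """Map a normalized action in [-1, 1] to physical controls.

    action: list of N_GEN + 2 floats (generators, tap, shedding).
    p_min, p_max: per-generator output limits in MW.
    tap_steps: assumed maximum tap shift in either direction.
    Returns (generator_mw, tap_shift, shed_fraction).
    """
    if len(action) != N_GEN + 2:
        raise ValueError(f"expected {N_GEN + 2} action entries, got {len(action)}")

    # Clip first so out-of-range policy outputs cannot violate the limits.
    clipped = [max(-1.0, min(1.0, a)) for a in action]

    # Generators: rescale [-1, 1] linearly onto [p_min, p_max].
    generator_mw = [
        lo + (a + 1.0) / 2.0 * (hi - lo)
        for a, lo, hi in zip(clipped[:N_GEN], p_min, p_max)
    ]
    # Tap changer: round to the nearest discrete tap position.
    tap_shift = round(clipped[N_GEN] * tap_steps)
    # Shedding: rescale [-1, 1] onto [0, MAX_SHED].
    shed_fraction = (clipped[N_GEN + 1] + 1.0) / 2.0 * MAX_SHED

    return generator_mw, tap_shift, shed_fraction
```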
Reward Design:
✅ Reward balanced supply and demand and low thermal loading
❌ Penalize instability, overloads, and excessive cost
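A reward with this shape can be sketched as one balance term minus weighted penalties. The weights, the use of relative supply/demand imbalance as a stand-in for instability, and the extra shedding penalty are all assumptions; the actual reward in `PowerGridEnv` may be structured differently.

```python
def grid_reward(supply_mw, demand_mw, line_loadings, cost, shed_fraction,
                w_balance=1.0, w_overload=5.0, w_cost=0.001, w_shed=10.0):
    """Illustrative per-step reward; all weights are hypothetical.

    line_loadings: per-line loading as a fraction of thermal rating (1.0 = 100%).
    cost: generation cost for the step, in arbitrary currency units.
    shed_fraction: share of load shed this step (0.0 to 0.20).
    """
    # Relative mismatch between supply and demand, used as an instability proxy.
    imbalance = abs(supply_mw - demand_mw) / demand_mw

    # Only loading above the thermal rating is penalized.
    overload = sum(max(0.0, loading - 1.0) for loading in line_loadings)

    return (
        w_balance * (1.0 - imbalance)
        - w_overload * overload
        - w_cost * cost
        - w_shed * shed_fraction
    )
```

Keeping each penalty as a separate weighted term makes it easy to log the terms individually and see which one dominates when training stalls.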
Training:
- Algorithm: PPO (SB3)
- Timesteps: 400,000
- VecNormalize wrapper; 5 evaluation episodes every 2,048 steps
Metrics:

Metric	Value
Mean Reward	~1480
Explained Variance	Up to 0.85
Blackout Risk	< 5%
Load Shedding	< 3% avg
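Explained variance measures how well the critic's value predictions track the observed returns: 1.0 is a perfect fit, and 0.0 is no better than predicting a constant. The sketch below uses the usual definition, which is assumed to match the `explained_variance` value that SB3 logs during PPO training; the function here is a standalone illustration, not SB3 code.

```python
from statistics import pvariance


def explained_variance(value_preds, returns):
    """Return 1 - Var(returns - value_preds) / Var(returns).

    value_preds: critic value estimates for a batch of states.
    returns: empirical returns for the same states.
    """
    var_returns = pvariance(returns)
    if var_returns == 0:
        # Undefined when the returns do not vary at all.
        return float("nan")
    residuals = [r - v for r, v in zip(returns, value_preds)]
    return 1.0 - pvariance(residuals) / var_returns
```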