⚡ Power Grid Optimization with LSTM + PPO

This repository showcases a hybrid deep learning + reinforcement learning system for power grid optimization in Lauderdale County, AL. The system forecasts demand using a weather-informed LSTM model and trains a PPO-based agent to maintain stability and minimize blackout risk under stress.


📈 Models

  • LSTM Demand Predictor
    A deep bidirectional LSTM with attention, trained on 4 years of TVA and weather data.

  • PPO Grid Policy
    Trained in a custom PowerGridEnv with generator output, transformer tap, and load shedding control.


🧠 Dataset Overview

  • Demand Data:
    Sourced from the U.S. EIA (TVA region, 2021–2024)

    • Demand, Net Generation, Day-Ahead Forecasts, Interchange
  • Weather Data:
    Daily min/max temperatures + precipitation

    • From 5 major TVA-region airports via NOAA
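A minimal sketch of how the two sources could be aligned on a daily index. It assumes station readings are averaged per day and that only dates present in both sources are kept; the field names (`tmin`, `tmax`, `precip`, `demand_mwh`), the helper names, and the toy values are hypothetical and not taken from the repository.

```python
from collections import defaultdict
from statistics import mean


def average_weather_by_day(station_rows):
    """Average each weather field across stations for every date."""
    grouped = defaultdict(list)
    for row in station_rows:
        grouped[row["date"]].append(row)
    return {
        date: {
            "tmin": mean(r["tmin"] for r in rows),
            "tmax": mean(r["tmax"] for r in rows),
            "precip": mean(r["precip"] for r in rows),
        }
        for date, rows in grouped.items()
    }


def join_demand_weather(demand_by_day, weather_by_day):
    """Keep only dates present in both sources, in chronological order."""
    shared = sorted(set(demand_by_day) & set(weather_by_day))
    return [
        {"date": d, "demand_mwh": demand_by_day[d], **weather_by_day[d]}
        for d in shared
    ]


# Toy values for illustration only; not real EIA or NOAA records.
stations = [
    {"date": "2021-01-01", "station": "A", "tmin": 30.0, "tmax": 50.0, "precip": 0.0},
    {"date": "2021-01-01", "station": "B", "tmin": 34.0, "tmax": 54.0, "precip": 0.2},
    {"date": "2021-01-02", "station": "A", "tmin": 28.0, "tmax": 45.0, "precip": 0.0},
]
demand = {"2021-01-01": 400000.0, "2021-01-03": 410000.0}
rows = join_demand_weather(demand, average_weather_by_day(stations))
```

ISO-formatted date strings sort chronologically, which is why a plain `sorted` is enough here.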

🧮 LSTM Model

  • Architecture:
    2-layer bidirectional LSTM + attention, followed by global pooling and dense layers.

  • Key Features:

    • Rolling temperature windows, demand lags
    • Weekly mean demand, change rate
    • Temp volatility, extreme flags
  • Metrics:

    Metric               Value
    R²                   0.911
    RMSE                 19,565 MWh
    Mean Error           713 MWh (over-forecast bias)
    Beats TVA Forecast   70.08% of days
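For reference, the four metrics above can be computed as in the sketch below. Two conventions are assumptions: mean error is taken as predicted minus actual (so a positive value means over-forecasting), and a day counts as "beating" the baseline when the model's absolute error is strictly smaller than that of the day-ahead forecast. The function name and the toy numbers are illustrative only.

```python
import math


def forecast_metrics(actual, predicted, baseline):
    """Return R², RMSE, mean error, and the share of days beating a baseline."""
    n = len(actual)
    errors = [p - a for p, a in zip(predicted, actual)]  # positive = over-forecast
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    wins = sum(
        abs(p - a) < abs(b - a)
        for p, a, b in zip(predicted, actual, baseline)
    )
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mean_error": sum(errors) / n,
        "beat_rate": wins / n,
    }


# Toy series for illustration; not the TVA data.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 410.0]
baseline = [120.0, 195.0, 330.0, 380.0]
m = forecast_metrics(actual, predicted, baseline)
```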

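The key features listed for the LSTM (rolling temperature windows, demand lags, weekly mean demand, change rate, temperature volatility, extreme flags) could be derived roughly as follows. This is a sketch under assumptions: the 7-day window, the 1-day lag, the Fahrenheit thresholds, and all feature names are hypothetical choices, and the repository's actual pipeline may differ. Windows cover only the days before the target day so the target never leaks into its own features.

```python
from statistics import mean, pstdev


def build_features(demand, tmax, window=7, hot_f=95.0, cold_f=32.0):
    """Build one feature row per day that has a full trailing window.

    Requires window >= 2 so the day-over-day change rate is defined.
    """
    rows = []
    for t in range(window, len(demand)):
        past_demand = demand[t - window:t]
        past_tmax = tmax[t - window:t]
        rows.append({
            "demand_lag_1": demand[t - 1],
            "demand_weekly_mean": mean(past_demand),
            "demand_change_rate": (demand[t - 1] - demand[t - 2]) / demand[t - 2],
            "tmax_rolling_mean": mean(past_tmax),
            "tmax_volatility": pstdev(past_tmax),
            # Flag days whose max temperature crosses an (assumed) threshold.
            "extreme_temp": int(tmax[t] >= hot_f or tmax[t] <= cold_f),
            "target": demand[t],
        })
    return rows


# Toy series for illustration only.
demand = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0]
tmax = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 96.0, 50.0]
features = build_features(demand, tmax)
```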
🤖 PPO DRL Agent

  • Environment:
    PyPSA-based Lauderdale County grid

    • 6 generators (Nuclear, Hydro, CCGT)
    • Load centers with realistic demand shares
    • Thermal constraints, ramp limits, marginal costs
  • Action Space:

    • Generator control
    • Transformer tap shift
    • Load shedding (up to 20%)
  • Reward Design:
    ✅ Balance demand/supply, low thermal overload
    ❌ Penalize instability, overloads, excessive cost

  • Training:

    • Algorithm: PPO (Stable-Baselines3)
    • Timesteps: 400,000
    • Normalization: VecNormalize
    • Evaluation: 5 episodes every 2,048 steps
  • Metrics:

    Metric               Value
    Mean Reward          ~1480
    Explained Variance   Up to 0.85
    Blackout Risk        < 5%
    Load Shedding        < 3% avg
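The reward design described above (reward supply/demand balance, penalize overloads, cost, and shedding) can be illustrated with a small shaping function. This is only a sketch: the weights, the function name, and the use of relative imbalance as a stand-in for "instability" are assumptions, not the reward implemented in PowerGridEnv.

```python
def grid_reward(supply_mw, demand_mw, line_loadings, cost, shed_fraction,
                w_balance=1.0, w_overload=10.0, w_cost=0.001, w_shed=5.0):
    """Combine a balance bonus with overload, cost, and shedding penalties.

    line_loadings are per-unit values: 1.0 means a line at its thermal limit.
    All weights are hypothetical and would need tuning.
    """
    # Relative mismatch between supply and demand (0.0 = perfectly balanced).
    imbalance = abs(supply_mw - demand_mw) / max(demand_mw, 1e-9)
    balance_bonus = max(0.0, 1.0 - imbalance)
    # Only loading above the thermal limit is penalized.
    overload = sum(max(0.0, loading - 1.0) for loading in line_loadings)
    return (w_balance * balance_bonus
            - w_overload * overload
            - w_cost * cost
            - w_shed * shed_fraction)


balanced = grid_reward(100.0, 100.0, [0.5, 0.9], cost=0.0, shed_fraction=0.0)
overloaded = grid_reward(100.0, 100.0, [1.2], cost=0.0, shed_fraction=0.0)
```

With these weights a balanced, unconstrained step scores higher than one with a 20% line overload, which is the ordering the reward design calls for.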

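To make the action space concrete, the sketch below decodes a normalized action vector into generator setpoints, a transformer tap shift, and a load-shedding fraction capped at 20%, while respecting ramp limits. The layout of the action vector, the [-1, 1] scaling, the `tap_steps` range, and the function names are assumptions for illustration; the actual environment may encode actions differently.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def decode_action(action, prev_output_mw, p_max_mw, ramp_mw,
                  max_shed=0.20, tap_steps=2):
    """Map an action in [-1, 1]^(n+2) to physical controls.

    Assumed layout: n generator entries, then one tap entry, then one
    load-shedding entry.
    """
    n = len(prev_output_mw)
    gen_part, tap_raw, shed_raw = action[:n], action[n], action[n + 1]
    outputs = []
    for a, prev, p_max, ramp in zip(gen_part, prev_output_mw, p_max_mw, ramp_mw):
        target = (clip(a, -1.0, 1.0) + 1.0) / 2.0 * p_max  # scale to [0, p_max]
        step = clip(target - prev, -ramp, ramp)            # enforce ramp limit
        outputs.append(clip(prev + step, 0.0, p_max))
    tap_shift = round(clip(tap_raw, -1.0, 1.0) * tap_steps)  # integer tap position
    shed_fraction = (clip(shed_raw, -1.0, 1.0) + 1.0) / 2.0 * max_shed
    return outputs, tap_shift, shed_fraction


# Two hypothetical generators: one asked to ramp up fully, one to shut down.
outputs, tap_shift, shed_fraction = decode_action(
    action=[1.0, -1.0, 0.4, 1.0],
    prev_output_mw=[500.0, 100.0],
    p_max_mw=[1000.0, 200.0],
    ramp_mw=[100.0, 50.0],
)
```

Clipping the step rather than the target keeps each generator within its ramp limit even when the policy requests a large jump.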