Outage Report

Leland Boeman, Sep 28, 2021

The Solar Forecast Arbiter platform experienced two outages over the last week. All systems were down.

During these periods the Solar Forecast Arbiter dashboard and API were inaccessible.

These outages were due to infrastructure failures that resulted in a loss of network connectivity to applications hosted in our Kubernetes cluster. We were unable to definitively determine the cause of these infrastructure failures, but we did identify repeated failures of non-essential processes that eventually caused an imbalance in workload between worker nodes. Those failures have been fixed, and the workload has been rebalanced between nodes. We also identified and corrected issues with error reporting and monitoring of our infrastructure.

We will continue to monitor the situation to minimize the possibility of downtime.