All systems operational

Critical host failure in FRA

Resolved
Partial outage
Started 8 months ago
Lasted about 5 hours

Affected

Europe
Updates
  • Resolved

    This incident has been resolved.

  • Monitoring

    We successfully recovered the impacted servers.

    Some workloads are still being recovered automatically.

  • Identified

    We identified the root cause of the issue: some of the physical servers were lost because of high load. It appears that rescheduling the impacted workloads triggered a cascading failure on other nodes.

    We are actively trying to recover access to the impacted servers.

    Some workloads are still impacted and will recover soon.
    New builds and container starts are also impacted.

  • Investigating

    We are seeing a sharp increase in container restarts and container boot failures in FRA.

    We are currently investigating this incident.