Lightning Takes Down Amazon And Microsoft Clouds

Lightning strikes cut the power to two major Amazon and Microsoft data centres and disabled backup systems in Dublin on Sunday, resulting in up to twelve hours of downtime.

Lightning struck a transformer, which Amazon said caused a fire and an explosion, followed by a total power outage. Amazon’s Elastic Compute Cloud (EC2) and Elastic Block Store (EBS) services were affected, and Microsoft’s BPOS services also went down.

The bolt was powerful enough to disable part of the phase control system that synchronises the backup generators, Amazon said on its Service Health Dashboard. It said it began investigating connectivity issues at around 03:00 GMT yesterday. Twelve hours later it was still struggling to restore 100 percent access.

Customers Left With Nowhere To Go

Microsoft told eWEEK Europe UK in a statement that at around the same time a widespread power outage caused connectivity issues for European BPOS customers. Services were restored to all customers around seven hours later, it said. In the past year, the BPOS service worldwide has seen several outages and at least one data breach. Microsoft is trying to move customers across to the recently-launched Office 365.

Just six days ago, an article on the Daily Telegraph Website said Microsoft’s Dublin data centre includes a “comprehensive system of secondary electricity sources” and that the whole operation could switch seamlessly to Amsterdam in the event of a “major catastrophe”. When asked by eWEEK Europe UK, Microsoft would not say whether this system had come into play during yesterday’s power outage, but it appears it did not.

By 15:00, Amazon’s dashboard had reported that 75 percent of the EC2 instances affected had been recovered but the large scale of disruption meant manual intervention was necessary before the remaining EBS volumes and EC2 instances could be restored.

“While many volumes will be restored over the next several hours, we anticipate that it will take 24-48 hours until the process is completed,” it said at that time. “In some cases EC2 instances or EBS servers lost power before writes to their volumes were completely consistent.

“Because of this, in some cases we will provide customers with a recovery snapshot instead of restoring their volume so they can validate the health of their volumes before returning them to service,” Amazon promised.

Among the Websites affected were the Telegraph’s puzzles page (an Amazon customer) and the Edinburgh Book Festival. Service-level agreement (SLA) terms are rarely made public, but it would be reasonable to assume that, barring future downtime this year, Amazon at 99.86 percent and Microsoft at 99.92 percent uptime for the year will have some penalties to pay to their customers, assuming most of them hold a 99.99 percent SLA.
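The uptime figures above can be reproduced with some simple arithmetic. The sketch below is illustrative only: it assumes a 365-day year, takes the roughly twelve and seven hours of downtime reported in this article as the only outages in that year, and uses hypothetical function names.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def uptime_percent(downtime_hours: float) -> float:
    """Annual uptime as a percentage, given total hours of downtime."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


def allowed_downtime_hours(sla_percent: float) -> float:
    """Hours of downtime per year that an SLA percentage permits."""
    return HOURS_PER_YEAR * (1.0 - sla_percent / 100.0)


if __name__ == "__main__":
    # Downtime figures are the approximate ones reported in the article.
    for provider, hours in (("Amazon", 12), ("Microsoft", 7)):
        print(f"{provider}: {uptime_percent(hours):.2f} percent uptime")

    budget_minutes = allowed_downtime_hours(99.99) * 60
    print(f"A 99.99 percent SLA allows about {budget_minutes:.1f} minutes a year")
```

On these assumptions, a 99.99 percent SLA leaves a downtime budget of less than one hour a year, so a single outage of seven or twelve hours would exceed it many times over.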

Microsoft’s Dublin site is its largest data centre outside of the US and its green credentials are heavily touted. For example, it uses Dublin’s naturally cool air for cooling rather than relying on power-intensive refrigeration. Amazon opened its data centre in Dublin in 2008 and is planning to expand the centre with the conversion of a 240,000 sq feet (22,300 sq metres) building.

David Jamieson

Comments

  • In one incident Amazon have set back the cloud computing market by years.

    Amazon and other cloud computing vendors make the case that they are the IT experts with regard to hosting and provisioning of utility based infrastructure and suggest to customers they can manage the infrastructure better than they can.

    They have been proven wrong.

    All other cloud vendors will get tarred with the same brush.

    http://grahamsblog4444.blogspot.com/

  • Gabriel Chaher, vice president, EMEA/APAC marketing, Quantum:

    “The latest outage from Amazon, where lightning caused a service disruption to Amazon's EC2 cloud computing platform in Dublin, is proof that basic data availability housekeeping must not be neglected. Simply moving data into a more flexible cloud based environment will not eliminate availability problems.

    “A resilient backup strategy will help to restore customer trust in public cloud services and will encourage widespread adoption. By keeping more than one logical, physical and site copy of data, the customer, or service provider, can be assured that data is always available for recovery.”
