British Airways Blames Power Surge For IT Outage, But Questions Still Remain

British Airways still appears to be struggling to get its operations back in order after a power outage caused an IT failure on Saturday.

Despite having two data centres and a significant IT infrastructure, BA said that a surge in electricity knocked its data centre near Heathrow offline and caused disruption to its check-in and operating systems. It noted that when power was restored, it damaged the servers, further prolonging the outage.

“There was a total loss of power at the data centre. The power then returned in an uncontrolled way causing physical damage to the IT servers,” BA said in a statement. “It was not an IT issue, it was a power issue.”

Questions, questions

However, beyond blaming a power surge and adamantly denying that the outage, and the length of time its systems have been down, was due to outsourcing IT support from the UK to India to save costs, BA has offered little information as to why its IT outage was so severe.

And the narrative is further muddied by the GMB union’s claim that outsourcing has led to a brain drain in the IT expertise needed to handle such an outage and execute a robust disaster recovery plan.

Then there are reports noting that local electricity providers to the Heathrow area denied that there had been any power surges, despite BA’s insistence that a power surge was to blame for knocking out its main data centre and backup systems.

Comments from various data centre experts, and our own knowledge of the protection modern data centres have against power surges and outages, make us question how robust BA’s data centre architecture and design is, given that its backup failed. Such protection typically includes backup power supplies, surge protection and several uninterruptible power supplies (UPS) that provide multiple levels of redundancy should one UPS fail.

A spokesperson from BA explained to Silicon that it is currently investigating why its backup did not spin up and mitigate a significant chunk of the chaos the outage has caused. However, the answer may take some time to come to light, as the airline is currently focusing on getting its operations back to normal and its customers to their booked destinations.

Speculation for an explanation

Silicon’s Roland Moore-Colyer went on BBC Click Radio to discuss the BA outage, and while no clear conclusions were reached, there was some solid speculation that could explain why BA has ended up in a rather costly and embarrassing situation.

The first point is that BA simply did not have the right technical skills or managerial nous to put a strong and reliable disaster recovery plan into action. While many cloud and managed services providers offer such services, they can be expensive, and BA has experience in running complex IT infrastructure, so it perhaps thought it could get by on its own in-house technical knowledge.

The second point is that BA has huge amounts of IT infrastructure, some of it older than the rest, meaning that when multiple systems are knocked offline, ensuring data is correctly synced and up-to-date is no easy task compared with companies using more modern, cloud-based infrastructure.

A further point that has been raised is whether BA would have coped better with such an IT outage, or whether it would have suffered one at all, had it been using more cloud-based systems.

Given that Google, Amazon Web Services, and Microsoft Azure invest billions of dollars into building robust cloud services, Moore-Colyer suggested that perhaps BA would have been better off pushing whatever infrastructure and data it feasibly could, given budgets and data regulations, into a major cloud platform.

However, it was pointed out that this is easier said than done. And despite major technology companies touting technology to facilitate the much-lauded digital transformation, moving to the cloud is still a relatively novel concept on the whole.

For BA, the IT outage has been a rather large disaster, with the company likely to lose more in reputation than it will in compensation for passengers. It serves as a lesson to other major organisations to ensure that they have enough systems in place, and a robust strategy, to handle power surges and IT outages, no matter how unlikely they may seem.


Roland Moore-Colyer

As News Editor of Silicon UK, Roland keeps a keen eye on the daily tech news coverage for the site, while also focusing on stories around cyber security, public sector IT, innovation, AI, and gadgets.
