What Banks Can Learn From The RBS IT Debacle

CAST Research Labs’ Jay Sappidi says banks can learn a lot from the RBS IT failure

After over a week, the NatWest IT omnishambles is petering out. Customers aren’t quite as furious as they were, meaning that as the hubbub subsides, the forensic analysis of how the proverbial excrement hit the fan can begin.

Other banks have been chortling to themselves, rubbing their grubby hands in glee, as NatWest owner RBS faced a barrage of criticism from customers and onlookers, some of whom put the blame on the outsourcing of operations and the firing of IT staff. CA Technologies has even been implicated, and Governor of the Bank of England Sir Mervyn King has promised a “very detailed inquiry” into the causes of the crash. The fallout, it seems, is rather serious.

Looking at things in a more constructive light, though, what can banks learn from the past week? Jay Sappidi, senior manager at US-based software implementation firm CAST Research Labs, talks to TechWeekEurope about just that.

‘Technical debt’

How can banks guard against upgrade and patching errors like those that hit NatWest?
Firstly, they should understand the state of their systems in terms of the structural quality of the code, identifying all the bad code patterns. A simple concept that is emerging to measure the overall size of the issues in a system that need to be fixed is called ‘technical debt’.

Assessing technical debt will give banks a baseline for the risk they are taking before they ‘upgrade’ (bearing in mind not all patches are ‘upgrades’). They should fix issues here before piling more upgrades into a system that may be destabilised by ‘upgrades’ that are not absolutely essential. Then the patch itself should be scrutinised for code quality.

Use of automated tools is highly recommended, as no one developer will be an expert in all the different technologies that make up a system, and even if there is one, manually evaluating the integrity of the application structure at an architectural level is next to impossible. Poor quality code may include issues such as SQL injections, recursive statements and the capacity to cause memory leaks – all of which can bring down systems.
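
To make the idea of a technical debt baseline concrete, here is a minimal sketch in Python of how such a figure might be estimated from the violation counts an automated tool reports. The formula (violations, times the share that must be fixed, times hours per fix, times an hourly rate), the class and function names, and every number below are assumptions made for illustration. They are not taken from CAST or from RBS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViolationGroup:
    """Hypothetical summary of code violations at one severity level."""
    severity: str
    count: int            # violations found by an automated scan
    fix_fraction: float   # share of them judged worth fixing (0.0 to 1.0)
    hours_per_fix: float  # assumed average effort to fix one violation


def technical_debt(groups, hourly_rate):
    """Estimate the cost of fixing the violations that need to be fixed."""
    return sum(
        g.count * g.fix_fraction * g.hours_per_fix * hourly_rate
        for g in groups
    )


# Illustrative figures only.
groups = [
    ViolationGroup("high", count=40, fix_fraction=0.5, hours_per_fix=2.0),
    ViolationGroup("medium", count=100, fix_fraction=0.25, hours_per_fix=1.0),
    ViolationGroup("low", count=200, fix_fraction=0.0, hours_per_fix=1.0),
]

baseline = technical_debt(groups, hourly_rate=80.0)
print(f"Estimated technical debt: {baseline:,.0f}")
```

Running the same calculation before and after a proposed patch would show whether the patch adds to the baseline or pays some of it down.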

What could NatWest/RBS have done better here?
Without knowing the exact reason for the patch, it is hard to say how well thought-out the decision to go ahead was. No upgrade should have been applied to a system with poor quality code – especially one as interlinked as a payment system.

A fundamental problem in many organisations is that they don’t have any kind of measurement system, with risk indicators, that can guide them in situations like this. Even if they knew that their system was not stable, it takes a strong IT exec to stand in the way of a determined marketing person who wants to roll out new functionality, but that is why they exist.

The IT team has to stand up and take responsibility for the quality of its system because applications are its ‘product’. Perhaps RBS should look at the information it holds about the quality of its applications and empower the IT team to veto new patches unless they are sure there will be no wider impact. They may need to invest in paying off some of the technical debt their systems carry, before applying any more patches.

What would you advise RBS to change as it recovers from this situation?
At this point it is imperative RBS carries out a thorough audit of its core systems to look for hidden technical debt. They would do this by looking at known defects and analysing the structure of their core systems at the code level.

They can then prioritise the issues IT needs to tackle to prevent future outages. If they don’t already have one, they should put in place a robust measurement programme which provides them with visibility and information on the transaction risks posed by the code quality to core systems. This kind of measurement programme also enables them to put in place a ‘quality gate’ through which all ‘upgrades’ need to pass and where, just like a Japanese car manufacturer, any single defect can halt code from rolling out into production.
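
A ‘quality gate’ of this kind could be sketched as follows. This is an illustration under stated assumptions rather than a description of any bank's actual release process: the Finding record, the severity labels, the rule names and the choice to block only on ‘critical’ findings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical defect reported by a code-quality scan of a patch."""
    rule: str
    severity: str   # assumed labels: "critical", "major" or "minor"
    component: str


# Assumption: any finding at one of these severities halts the release.
BLOCKING_SEVERITIES = {"critical"}


def quality_gate(findings):
    """Return (passed, blockers); a single blocker is enough to fail."""
    blockers = [f for f in findings if f.severity in BLOCKING_SEVERITIES]
    return (not blockers, blockers)


# Illustrative scan result for a proposed patch.
patch_findings = [
    Finding("unclosed-resource", "minor", "reporting"),
    Finding("sql-built-by-concatenation", "critical", "payments"),
]

passed, blockers = quality_gate(patch_findings)
if passed:
    print("Gate passed: patch may proceed to production.")
else:
    for b in blockers:
        print(f"Gate failed: {b.rule} in {b.component}")
```

The point of the design is that the gate returns the blocking defects along with the verdict, so the team halting a release can show exactly why.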
