Putting Businesses On A Data Diet

Iain Chidgey, Delphix

Virtualisation has created a glut of duplicated data. Virtualised databases can trim the fat away, says Iain Chidgey

When it comes to data, businesses have a weight problem. In spite of the great strides in virtualisation and ‘Big Data’ applications – and in part, because of them – organisations still face major problems caused by the sheer weight of data that they create, use and store.

Data isn’t just ‘big’: it’s heavy. Virtual machines do nothing to bring down the volume of information itself; in fact, server virtualisation has made it much easier for organisations to tolerate ‘weight gain’ in their data. While the price of storage may be falling, this preponderance of data carries hidden costs that stem from the complexity it adds to project workflows.

Database copies spring up all over

A great example of this is the spawning of multiple copies of big databases. For every live database, there is an extended family of anywhere between two and 30 copies which are maintained by employees in different departments for testing and development, quality assurance, back-up, reporting, disaster recovery – the list goes on. All of these copies are sitting on physical hardware, and for each terabyte of original data, organisations produce, on average, eight additional terabytes of duplicate data.
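
To put that multiplier in perspective, here is a back-of-the-envelope sketch in Python; the database names and sizes are hypothetical, used purely to illustrate the duplication factor quoted above.

```python
# Illustrative only: roughly eight terabytes of copies for every
# terabyte of live data, per the average cited above.
live_databases_tb = {"orders": 2.0, "customers": 1.5, "reporting": 0.5}  # hypothetical sizes
copies_per_terabyte = 8  # average duplication factor

live_total = sum(live_databases_tb.values())
duplicate_total = live_total * copies_per_terabyte

print(f"Live data:        {live_total:.1f} TB")
print(f"Duplicate copies: {duplicate_total:.1f} TB")
# Live data:        4.0 TB
# Duplicate copies: 32.0 TB
```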

This cost should not be calculated solely as the expense of storing eight terabytes of data for every one-terabyte database. The weight of data makes it harder to get the right information to the right teams at the right time, and it extends the time taken to prioritise, schedule and test application environments. The proliferation of different databases and versions means it is not uncommon for admins to spend eight hours setting up a single 20-minute application environment test.

As batch testing lengthens, it takes longer for project managers to deliver satisfactory, error-free environments, leading either to severe delays to project or product release dates, or to unacceptable error rates. This can add weeks or months to projects, at serious financial and reputational cost to the organisation involved.

Fat databases make for sluggish firms

One of the largest ticket sellers in the US provides a clear illustration of these hidden costs: the company found itself having to add three weeks to every project simply to prepare its data. Each project also included an average 20 per cent schedule buffer to offset delays in updating test data. The firm planned to embrace quality at the expense of speed; however, as data volumes grew it was forced to test with samples instead of the actual data, resulting in more bugs leaking into production.

In project management there are always tradeoffs between quality, cost and speed; however, in this company’s case, the attempt to embrace quality over speed actually led to lower quality, while the delays and outages meant additional costs.

Database replication is like binge-eating: just as organisms grow fat from gorging on unnecessary calories, organisations grow slower and more ponderous with the trillions of unneeded bits they must consume. To be lean, agile and healthy, businesses need to go on a ‘data diet’.

Why not virtualise the database?

That’s exactly what the ticket company did, by taking advantage of new technologies that change the way organisations manage their databases: technologies that harness the power of virtualisation and apply it to databases themselves.

The new techniques of database virtualisation don’t involve running a database in a virtualised environment, but rather virtualising the database itself. Instead of employees creating multiple copies of the same database – each with their own additions and amendments, and varying degrees of data ‘freshness’ and accuracy – the technology makes a single copy of each database and presents each person with a virtual instance every time it is needed.
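
To see why a virtual instance can be so cheap, here is a minimal, purely illustrative Python sketch of the copy-on-write idea behind such products; the class names and block model are hypothetical and not any vendor’s actual implementation.

```python
# Toy copy-on-write model: one shared physical copy, many lightweight
# virtual instances that store only the blocks they change.

class GoldenCopy:
    """The single physical copy of the production data."""
    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block_id -> data

class VirtualInstance:
    """A per-user 'copy' that keeps only its own modified blocks."""
    def __init__(self, golden):
        self.golden = golden
        self.delta = {}  # only changed blocks consume new storage

    def read(self, block_id):
        # Reads fall through to the shared golden copy unless overwritten here.
        return self.delta.get(block_id, self.golden.blocks[block_id])

    def write(self, block_id, data):
        # Writes never touch the golden copy; they land in this instance's delta.
        self.delta[block_id] = data

golden = GoldenCopy({i: f"block-{i}" for i in range(1_000)})
test_env = VirtualInstance(golden)  # 'provisioned' instantly, near-zero extra storage
qa_env = VirtualInstance(golden)    # a second user gets another instant copy

test_env.write(42, "patched row")
print(test_env.read(42), qa_env.read(42))      # patched row block-42
print(len(test_env.delta), len(qa_env.delta))  # 1 0
```

Each instance behaves like a private database copy, yet only the handful of blocks a user actually changes takes up new space.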

By providing a virtualised instance instead of replicating a new database every time, database virtualisation drastically reduces the demands on businesses’ storage infrastructure and removes the need for costly and time-consuming back-ups, for synchronising multiple databases, and for provisioning a full-fat copy to each employee who needs one.

The impact of this technology is profound. Resources previously dedicated to managing the multiple database copies are freed up and can support new requests from developers, analysts and departments that previously couldn’t access the database. For the ticket seller mentioned above, adopting database virtualisation eliminated the 20 per cent schedule buffer and the three-week lead time, and lowered the number of production bugs by a fifth, all at a lower cost.

It’s time to change the way we think about both virtualisation and data. Today’s IT could not exist without virtualised servers, but server virtualisation has also caused organisations to grow fat and lazy in their data practices. What’s worse, affordable and plentiful storage means that the problems and costs are most often hidden from businesses, even as they cause a dangerous sclerosis. It’s high time businesses performed a health check to see how much unnecessary weight they are carrying in their databases and, if necessary, put themselves on a data diet.

Iain Chidgey is EMEA vice president of data management specialist Delphix