HP: Deduplication And The Future Of Tape

Data deduplication needs to expand out of the backup market and become more efficient, but many companies are still struggling to phase out tape, according to HP.

Tape is sticking around

Commenting on the general state of the storage market, Johns predicted that, despite efforts to move towards SSD storage, tape will continue to be around for a long time – particularly as it can be stored without using power.

“Over time technologies change. It’s too expensive to put everything on solid state today, but in 10 years’ time, we’ll probably be looking at a very different media in terms of how we think about storage,” he said.

“There are a lot of companies that I speak to that are really trying to understand how to phase out tape. I haven’t met too many that have actually done it completely yet, but they are looking at how they will do it,” he added. “Just the fact that people have all that data on tape anyway, they’re going to have to keep tape around just to be able to restore the data if they ever need to.”

HP was keen to point out that its converged architecture offers customers a one-stop shop, but does not lock them in to a single architecture. “If you don’t want to buy it from us you don’t have to. If we build the best stuff and we build it on the lowest cost infrastructure, and it’s easier to manage and more common, that’s how we win,” said Johns.

HP’s converged infrastructure operates on the premise that it is open, giving customers greater freedom of choice. Johns claims that, unlike competitors EMC and NetApp, HP is able to discuss with customers whether they need a centralised shared storage system, or whether DAS (direct attached storage) would be sufficient.

“Increasingly, customers are building their Exchange architectures on DAS storage. Those companies [EMC and NetApp] will not have that conversation with them because they’re always going to force-fit them into the solution that they have. We don’t have to do that, we can have a conversation about the right architecture,” he said.

Storage still growing

Storage is now reportedly the only major part of IT expenditure that is still growing out of control. Despite overall tightening in IT budgets, the amount of storage that users have on site – and the amount they spend on it – continues to grow.

“Storage is the last bastion of IT that hasn’t been modernised,” said Nikki Wilton, director of data management at Unisys, in an eWEEK debate, Stamp Out Storage Waste, in March. Storage’s share of IT hardware spend actually grew from 10 percent to 17 percent during the course of 2008 and 2009.

Hardware vendors are therefore torn between offering users a way to rein in those costs and exploiting them mercilessly while the storage boom goes on. The reality falls somewhere between the two extremes.

Earlier this year, EMC revealed data protection software called DD Boost at its annual EMC World conference, which it claimed was the first product to optimise and accelerate disk-based backup by integrating it with deduplication storage.

DD Boost distributes the deduplication process, identifying data segments as they arrive in a Data Domain storage system. It analyses the segments to determine which ones are new, then compresses and forwards only the unique segments to the storage array. EMC claims that this can reduce local network traffic by up to 95 percent, because redundant data segments do not enter the array in the first place.
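The general idea of forwarding only unique segments can be illustrated with a minimal Python sketch. This is not EMC's implementation: the fixed 4,096-byte segment size, the SHA-256 fingerprints, the zlib compression and the function names are all assumptions chosen for illustration.

```python
import hashlib
import zlib

SEGMENT_SIZE = 4096  # hypothetical fixed segment size, chosen for illustration


def split_segments(data: bytes, size: int = SEGMENT_SIZE):
    """Cut the backup stream into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe_and_forward(data: bytes, known_fingerprints: set):
    """Return (recipe, payloads).

    recipe   -- ordered list of segment fingerprints, enough to rebuild the data
    payloads -- compressed bytes for segments not seen before, keyed by fingerprint
    """
    recipe = []
    payloads = {}
    for segment in split_segments(data):
        fingerprint = hashlib.sha256(segment).hexdigest()
        recipe.append(fingerprint)
        if fingerprint not in known_fingerprints:
            # New segment: remember it and forward a compressed copy.
            known_fingerprints.add(fingerprint)
            payloads[fingerprint] = zlib.compress(segment)
    return recipe, payloads


def restore(recipe, store):
    """Rebuild the original data from a recipe and the stored segments."""
    return b"".join(zlib.decompress(store[fp]) for fp in recipe)
```

In this sketch, backing up the same data a second time produces an empty set of payloads, so only the list of fingerprints would need to cross the network. How close real-world savings come to the 95 percent EMC cites depends on how much of the data is actually redundant.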

IBM has also been making large strides in the storage space recently, after unveiling several new tiered storage products in April. According to Brian Truskowski, general manager for storage in IBM’s Systems and Technology Group, businesses are struggling with the volume and evolving nature of the data they collect, and they’re under huge pressure to turn this data into insight.

“This is what clients want – they are looking for storage technology that will improve efficiency and reduce costs, increase capacity and intelligently place the data where it can perform the best for workloads like real-time analytics, database or transaction processing,” he said.