Study Points To Environmental Impact of Data Universe


Researcher IDC and storage infrastructure provider EMC claim there is a cost to storing everything, and that’s in people, power and cooling

In a report released May 18 that ostensibly takes roll of everything digital (up to and including the calendar year ending at midnight, Dec. 31, 2008), IDC and EMC estimate the so-called digital universe to be 487 billion gigabytes in size, give or take a few bytes.

Apparently to get a head start on next year’s universal digital roundup, EMC, one of the world’s major data storage companies, is publicly keeping track of all the new bytes coming into the world. EMC keeps a Web page that includes a rather fast-moving “Worldwide Information Growth Ticker.”

For the third straight year, researcher IDC and EMC have attempted to quantify all things digital; that is, to find and account for all the digital data in the world, wherever it resides.

This study includes everything: email, instant messages, voice recordings, telephone answering machines, plain old documents, photos, video, graphics, data logs, TiVo recordings, business transaction data. Everything. EMC and IDC claim to have accounted for it all; at least this is what they tell eWEEK.

There are obvious commercial reasons for spending the time, money and effort to try to come up with such an accounting. EMC certainly wants to sell you storage capacity, data security, and control over all your data, so you can master your own part of that ever-growing digital universe; rocket science it is not.

On the other side, IDC wants to get its arms around how much data it is researching — along with the trends involved — so it can resell all this information. Both are perfectly reasonable motivations.

But the real questions are these: How accurate is it all? Do we look at this story and shrug, roll our eyes, or what? How does this affect you and me, if at all?

“You have to realize that we [IDC] cover IT from a very broad perspective,” Dave Reinsel, group vice president of storage and semiconductors at IDC, told eWEEK.

“We have all this data we can leverage, whether it’s from digital surveillance cameras, digital TVs, servers, storage … we cover pictures, megapixels, resolution, everything. So why not try to get a handle on it all?”

IDC knows how many storage devices have been shipped, and thus how much capacity is actually available, Reinsel said, whether it is in the form of spinning disk drives, tape, optical or flash storage. IDC can then make educated assumptions about how much of that capacity is utilised, based on its own regular surveys of users and vendors.

“Our methodology in measuring content creation is very rigid,” Reinsel said. “Obviously modeling is involved in certain segmentations; for example, we have very aggressive assumptions around RFID, which has packets so small and so tight that it doesn’t even compare to the impact of digitising a Hollywood movie.”

People will continue to create and/or copy an increasing number of pictures, phone calls, emails, blogs, and videos each day. Enterprises are capturing daily transaction records and adding to their data warehouses by the second.

The amount of security-intensive information also is rising precipitously — most noticeably in the video surveillance sector.

Thanks to increasing legal and commercial regulation (U.S. examples of this are HIPAA and Sarbanes-Oxley), governments are requiring that more digital information be retained and be available for health-care and legal reasons.

IDC claims that the amount of new information created in 2008 is roughly (very roughly) the equivalent of more than 237 billion fully loaded Amazon Kindle wireless reading devices; 4.8 quadrillion online bank transactions; 3 quadrillion Twitter feeds; 162 trillion digital photos; 30 billion fully loaded Apple iPod Touches; or 19 billion fully loaded Blu-ray discs.

During the time it took you to read that last paragraph, the numbers probably went out of date. Such is the futility of trying to get a true fix on all this content.
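For readers who want to sanity-check the comparisons IDC offers, a minimal sketch follows. It assumes each comparison is meant to equal the full 487 billion gigabyte estimate quoted earlier, which the report summary does not state outright, and it uses decimal units (1 GB = 1,000 MB). The function and variable names are illustrative only; they do not come from IDC or EMC.

```python
# Rough sanity check of the equivalences quoted in the article.
# ASSUMPTION: every comparison stands for the same 487 billion GB total.
# ASSUMPTION: decimal units (1 GB = 1,000 MB = 1,000,000 KB).

TOTAL_GB = 487e9  # size estimate quoted in the article

# Item counts as quoted in the article
COUNTS = {
    "Amazon Kindle": 237e9,
    "online bank transaction": 4.8e15,
    "Twitter feed": 3e15,
    "digital photo": 162e12,
    "Apple iPod Touch": 30e9,
    "Blu-ray disc": 19e9,
}


def implied_size_gb(total_gb, counts):
    """Return the per-item size (in GB) implied by each quoted count."""
    return {name: total_gb / count for name, count in counts.items()}


def format_size(gb):
    """Render a size in GB, MB or KB, whichever reads most naturally."""
    if gb >= 1:
        return f"{gb:.1f} GB"
    mb = gb * 1000
    if mb >= 1:
        return f"{mb:.1f} MB"
    return f"{mb * 1000:.0f} KB"


sizes = implied_size_gb(TOTAL_GB, COUNTS)

if __name__ == "__main__":
    for name, gb in sizes.items():
        print(f"{name}: about {format_size(gb)} each")
```

Under those assumptions the implied sizes land near 2 GB per Kindle, 16 GB per iPod Touch and 25 GB per Blu-ray disc, so the device comparisons are at least internally consistent with the headline figure.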

Here’s another question: In this recessionary economy, do enterprises generally have the ability to keep ahead of this data growth and maintain control over their crown jewels — customer, partner, market and internal information?

“That’s the key question: Is this growth out of control?” Reinsel said. “Another question might be: How do we do information conservation the right way? How do we get better at it?

“Storage is relatively cheap; people think it is easier to just store everything than to clean and store it. But there is a cost to storing everything, and that’s in people, power and cooling to maintain it. Better to consider deduplication, compression, thin-provisioning — features like those — which result in less and better data stored.”

Reinsel said companies are fully aware of these technologies but that adoption remains slow anyway.

“Companies need to take a holistic look at how they process and store their data,” Reinsel said. “They need to pay more attention and have a broader view.”
