Analytics Speeds Doubled By IBM Storage Architecture

Announced at the Supercomputing 2010 show, IBM’s new architecture can double the speed of analytics processing

At the Supercomputing 2010 conference, IBM released details of a new storage architecture design that can double analytics processing speeds.

Announced at the conference in New Orleans, the new General Parallel File System – Shared Nothing Cluster (GPFS-SNC) architecture, created by IBM researchers, can convert terabytes of raw information into actionable insights twice as fast as previously possible. IBM also won the conference's Storage Challenge competition, which recognises the most innovative and effective design in high-performance computing based on measurements of performance, scalability and storage subsystem utilisation.

High Availability Clustering

The company said the GPFS-SNC system is primarily useful for cloud computing applications and for data-intensive workloads such as digital media, data mining and financial analytics, and that it can cut hours off complex computations without requiring heavy infrastructure investment.

Created at IBM Research in Almaden, the architecture is designed to provide higher availability through advanced clustering technologies, dynamic file system management and advanced data replication techniques. Because the nodes "share nothing", new levels of availability, performance and scaling become achievable. GPFS-SNC is a distributed computing architecture in which each node is self-sufficient; tasks are divided up among these independent computers, and no node waits on another, IBM said in a press release on the new architecture.
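As a rough illustration of that shared-nothing principle (a minimal Python sketch, not IBM's GPFS-SNC code; the partitioning scheme and the analytics kernel are assumptions for illustration), each worker below owns a disjoint slice of the data, computes on it independently with no shared state, and only the small partial results are merged at the end:

```python
# Illustrative sketch of the "shared nothing" idea: every worker owns its own
# partition, does its work without touching anyone else's data, and only the
# tiny partial results are combined. Not IBM's implementation.
from multiprocessing import Pool

def analyse_partition(partition):
    """Each 'node' works only on its local partition; nothing is shared."""
    return sum(x * x for x in partition)  # stand-in for a real analytics kernel

if __name__ == "__main__":
    records = list(range(1_000_000))
    num_nodes = 4
    # Partition the data so every worker is self-sufficient.
    partitions = [records[i::num_nodes] for i in range(num_nodes)]
    with Pool(num_nodes) as pool:
        partials = pool.map(analyse_partition, partitions)
    print("combined result:", sum(partials))
```

Because no worker ever blocks on another's data, adding workers scales the computation in much the same way that adding nodes scales a shared-nothing cluster.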

“Businesses are literally running into walls, unable to keep up with the vast amounts of data generated on a daily basis,” said Prasenjit Sarkar, a master inventor in the storage analytics and resiliency unit at IBM Research, in a statement. “We constantly research and develop the industry’s most advanced storage technologies to solve the world’s biggest data problems. This new way of storage partitioning is another step forward on this path as it gives businesses faster time-to-insight without concern for traditional storage limitations.”

Analytics has been identified as one of IBM’s core areas of interest over the next five years, along with a focus on growth markets, cloud computing and its Smarter Planet initiative. In a five-year outlook speech at the company’s annual meeting with financial analysts, IBM CEO Samuel Palmisano outlined a 2015 roadmap in which he singled out analytics as a driver. Moreover, he said IBM would spend $20 billion on acquisitions between 2011 and 2015 – Big Blue has already spent $14 billion on 24 analytics-related acquisitions.

Analytics: An Important Moneyspinner For IBM

In a recent interview, Rob Ashe, general manager for business analytics at IBM, told eWEEK, “Analytics is a key part of our 2015 roadmap. Last year, analytics contributed $9 billion (£5.6 billion) to our revenues and we expect to see that grow to $16 billion (£10 billion) in 2015.”

Running analytics applications on extremely large data sets is becoming increasingly important, but organisations can only increase the size of their storage facilities so much, IBM said. As businesses search for ways to harness their stored data for new levels of business insight, they need alternative approaches such as cloud computing to keep up with growing data requirements and to provide workload flexibility through the rapid provisioning of system resources for different types of workloads.

IBM’s current GPFS technology underpins IBM’s High Performance Computing Systems, IBM’s Information Archive, IBM Scale-Out NAS (SONAS) and the IBM Smart Business Compute Cloud. These research lab innovations enable future expansion of those offerings to further tackle tough big data problems, IBM said.

For instance, large financial institutions run complex algorithms to analyse risk based on petabytes of data, IBM said. With billions of files spread across multiple computing platforms and stored around the world, these mission-critical calculations demand significant IT resources and cost because of their complexity. Using the GPFS-SNC design, running such a complex analytics workload could become much more efficient, because the design provides a common file system and namespace across disparate computing platforms, streamlining the process and reducing disk-space requirements, IBM officials said.
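To make that concrete, the hedged sketch below assumes a single shared namespace mounted at the hypothetical path /gpfs/risk on every node; the file layout and the toy "exposure" metric are illustrative assumptions, not IBM's software. The point is simply that an application can walk one directory tree rather than coordinating data movement between platforms:

```python
# Hedged sketch: with one shared file system and namespace (the hypothetical
# mount point /gpfs/risk, visible identically from every node), a risk job can
# walk a single directory tree instead of shipping files between platforms.
import os

MOUNT_POINT = "/gpfs/risk"  # hypothetical GPFS-SNC mount, same path on every node

def portfolio_exposure(root=MOUNT_POINT):
    """Aggregate a toy per-file exposure figure across the whole namespace."""
    total = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Stand-in for parsing positions and pricing risk from each file.
            total += os.path.getsize(path) * 1e-6
    return total

if __name__ == "__main__":
    print("aggregate exposure (toy units):", portfolio_exposure())
```

In this picture the streamlining comes from the namespace itself: the application code does not need to know which platform or site physically holds each file.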