Microsoft Adopts Apache Hadoop For Data Analytics

Open-source Apache Hadoop to become the big data analyser for Microsoft’s Windows Azure and Windows Server

Microsoft’s plans for tackling big data within the enterprise include an Apache Hadoop-based distribution for Windows Server and Windows Azure.

Apache Hadoop is a scalable solution for companies looking to crunch massive amounts of data, sorting through it to find the tendencies and patterns necessary to make better business decisions. Organisations already using the open-source framework for production or educational purposes include eBay, Facebook, Hulu, IBM, Twitter, and a handful of universities.

Elephantine Analysis Project

Yahoo nurtured Hadoop as a “science project” of sorts for six years before the work was spun off into an independent, venture capital-funded company called Hortonworks.

Microsoft announced the Hadoop augmentation for Windows Server and Windows Azure – which includes a strategic partnership with Hortonworks – at the PASS Summit 2011. “The next frontier is all about uniting the power of the cloud with the power of data to gain insights that simply weren’t possible even just a few years ago,” Microsoft corporate vice president Ted Kummert told the audience during his opening keynote.

Microsoft plans to release a community technology preview (CTP) of the Hadoop service for Windows Azure by the end of this year, followed by a CTP of the Hadoop service for Windows Server sometime in 2012. In a release associated with the announcement, Microsoft also pledged to “work closely with the Hadoop community” and “propose contributions back to the Apache Software Foundation and the Hadoop project”.

“Over 80 percent of new data being generated is from unstructured sources,” Eric Baldeschwieler, CEO of Hortonworks, wrote in a statement released by Microsoft. “We are excited to work with Microsoft to help make Apache Hadoop a compelling platform for storing and processing data.”

Microsoft used the PASS Summit 2011 to show off a “Data Explorer” prototype for sharing and discovering business data, which will apparently be included with the Windows Azure Marketplace. Microsoft is also planning a set of “highly interactive” data visualisation tools that use touch technology. That seems to be an attempt to capitalise on upcoming releases such as Windows 8, whose user interface will offer a significant touch-based component.