Microsoft Brings Hadoop Big Data To Windows Azure

Microsoft makes Hadoop-based HDInsight generally available as part of big data push

Microsoft has launched HDInsight, its Hadoop service on the Azure cloud, explaining that the move is part of its plan to bring big data to one billion people. The company suggested it signals a shift similar to the one that made data-processing tools such as Excel commonplace.

“Microsoft’s perspective is that embracing the new value of data will lead to a major transformation as significant as when line of business applications matured to the point where they touched everyone inside an organisation,” said Quentin Clark, corporate vice president of Microsoft’s data platform group, ahead of a Tuesday keynote at Strata + Hadoop World 2013 in New York City.

‘Major transformation’

HDInsight, offered on Microsoft’s Windows Azure cloud platform, includes the Apache Hadoop data processing platform along with its associated tools. The offering uses Hortonworks’ flagship Hadoop distribution, Hortonworks Data Platform (HDP).

Part of Microsoft’s strategy to popularise big data is to integrate HDInsight with its own business intelligence tools, including Excel, SQL Server and Power BI, Clark said. As part of that integration, Microsoft worked with Hortonworks on HDP, contributing tens of thousands of lines of code that are to be fed back into the open source Apache Hadoop code base, he added.

Hortonworks made HDP 2.0 generally available last week, and HDP 2.0 for Windows Server will become generally available next month, according to Clark. He said Hadoop v2 will be supported by HDInsight in “a future update”, but no definite timetable has yet been announced.

Big Data vision

Microsoft’s vision for big data is similar to its original mission of putting a computer on every desktop, but the company has recognised that this approach needs updating for the post-PC era. Clark said the big data “transformation” will occur “when anyone with a question that can be answered by data, gets their answer”.

Integration with Microsoft’s BI tools means that, for instance, a user can deploy an Excel feature called Power BI to draw data from Hadoop MapReduce, which can then be automatically analysed and visualised.

The version of Hadoop used by HDInsight will be fully compatible with HDP, allowing workloads to be easily shifted to non-Azure deployments, and Microsoft is also billing HDP for Windows Server as offering an easy migration path to the cloud-based HDInsight.

Microsoft has been testing HDInsight in full production mode with several clients for “a number of months”, according to Clark, with customers including the City of Barcelona, which uses it to analyse the effectiveness of public services, and Virginia Tech, which uses it to offer access to DNA sequencing tools and resources.

Amazon Web Services also offers Hadoop, while Rackspace and IBM have both said they will offer cloud-based Hadoop distributions soon.
