Apache Releases Stable Hadoop Version 1.0 Framework


The Apache Software Foundation has officially delivered the much-anticipated version 1.0 of the Apache Hadoop framework

The Apache Software Foundation (ASF) has officially released Apache Hadoop 1.0, the open-source software framework for scalable, distributed computing and data storage.

The 4 January release marks a major milestone six years in the making: the framework has reached the level of stability and enterprise-readiness needed to earn the 1.0 designation, Apache officials said.

Stable And Reliable

“In addition to the major security improvements and support for HBase, the really big deal about version 1.0 is this is a release we feel that people can look at as very stable,” Apache Hadoop Vice President Arun Murthy told eWEEK. “The developer community is really up for supporting version 1.0, and we expect 1.0 adoption to be much faster than for other versions.”

Murthy said Apache Hadoop 1.0 reflects six years of development, production experience, extensive testing, and feedback from hundreds of knowledgeable users, data scientists and systems engineers, culminating in a highly stable, enterprise-ready release of the fastest-growing big data platform. It includes support for:

  • HBase (sync and flush support for transaction logging)
  • Security (strong authentication via Kerberos)
  • WebHDFS (RESTful API to HDFS)
  • Performance-enhanced access to local files for HBase
  • Other performance enhancements, bug fixes and features
  • All version 0.20.205 and prior 0.20.2xx features
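
To give a rough idea of what the RESTful access to HDFS looks like in practice, the sketch below builds a WebHDFS request URL in Python. WebHDFS exposes file system operations under the /webhdfs/v1/ path prefix, with the operation passed as an op query parameter. The helper name webhdfs_url, the host name and the port 50070 are illustrative assumptions and do not come from the release announcement; the right values depend on how a given cluster is configured.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS URL: http://<host>:<port>/webhdfs/v1/<path>?op=<OP>.

    Hypothetical helper for illustration; it only assembles the URL
    and does not contact any cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation name goes first, followed by any optional parameters.
    query = urlencode({"op": op, **params})
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed host and port, for illustration only.
print(webhdfs_url("namenode.example.com", 50070, "/user/alice", "LISTSTATUS"))
```

Sending an HTTP GET to such a URL with any standard HTTP client would, on a cluster with WebHDFS enabled, return the directory listing as JSON, so no Hadoop client library is needed on the calling side.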

Big Data

Apache Hadoop serves as a foundation of cloud computing and is at the epicentre of “big data” solutions, ASF officials said. Hadoop enables data-intensive distributed applications to work with thousands of nodes and exabytes of data. Hadoop also enables organizations to more efficiently and cost-effectively store, process, manage and analyze the growing volumes of data being created and collected every day. And it connects thousands of servers to process and analyze data at supercomputing speed.

“This release is the culmination of a lot of hard work and cooperation from a vibrant Apache community group of dedicated software developers and committers that has brought new levels of stability and production expertise to the Hadoop project,” Murthy said in a statement. “Hadoop is becoming the de facto data platform that enables organizations to store, process and query vast torrents of data, and the new release represents an important step forward in performance, stability and security.”

Hadoop Adoption

Hadoop has been referred to as a “Swiss army knife of the 21st century” and is widely deployed at organisations around the globe, including industry leaders from across the Internet and social networking landscape such as Amazon Web Services, AOL, Apple, eBay, Facebook, Foursquare, HP, LinkedIn, Netflix, The New York Times, Rackspace, Twitter and Yahoo.

Other technology leaders such as Microsoft and IBM have integrated Apache Hadoop into their offerings. Yahoo, an early pioneer, hosts the world’s largest known Hadoop production environment to date, spanning more than 42,000 nodes.

“Achieving the 1.0 release status is a momentous achievement from the Apache Hadoop community and the result of hard development work and shared learnings over the years,” said Jay Rossiter, senior vice president of the Cloud Platform Group at Yahoo, in a statement. “Apache Hadoop will continue to be an important area of investment for Yahoo. Today Hadoop powers every click at Yahoo, helping to deliver personalized content and experiences to more than 700 million consumers worldwide.”

As with all Apache products, Apache Hadoop software is released under the Apache License v2.0 and is overseen by a self-selected team of active contributors to the project. A Project Management Committee (PMC) guides the project’s day-to-day operations, including community development and product releases.

Apache Hadoop release notes, source code, documentation and related resources are available at http://hadoop.apache.org/.
