What's New with Hadoop? Big Data Supercomputing

The Apache Software Foundation announced the general availability of Hadoop 2.0 this week. More than four years in the making, the release will likely be a significant advancement in how Hadoop is used for collecting and managing big data.

The most noteworthy addition in Hadoop 2.0 is a technology dubbed YARN, an operating system of sorts that runs applications such as MapReduce and Storm. Enterprises using Hadoop will be able to run multiple applications simultaneously for more efficient support of data throughout its lifecycle. MapReduce is a batch processor that lines up search jobs that go into the Hadoop Distributed File System (HDFS) in order to extract information. In the previous version of Hadoop's MapReduce, users could run only one job at a time.

"With the release of stable Hadoop 2, the community celebrates not only an iteration of the software, but an inflection point in the project's development. We believe this platform is capable of supporting new applications and research in large-scale, commodity computing," said Apache Hadoop Vice President Chris Douglas. "The Apache Software Foundation creates the conditions for innovative, community-driven technology like Hadoop to evolve. When that process converges, the result is inspiring."