Cloudera Releases Project Impala, Offers Real-Time Hadoop Analytics

Big Data expert Cloudera has delivered Project Impala, a fast query system that uses more-or-less standard SQL to query large datasets stored in Apache Hadoop.

The open source Impala engine was built from scratch over two years and supports all the common file and data formats, while an optional subscription module adds support and management for real-time Hadoop analytics.

Speed of thought

Project Impala entered a public beta stage in October 2012, long before anyone else attempted to build a massively parallel processing query engine that’s native to Hadoop. Since then, over 40 enterprise customers and open source experts have been putting the platform through its paces, including 37signals, Expedia and Six3 Systems. Cloudera has worked to refine Impala in real-world applications and deliver a release suitable for enterprise workloads.

With Impala, users can query data stored in HDFS and HBase directly. The framework supports all common file and data formats, allowing users to share and reuse information from a single dataset across all computing workloads. This unique approach eliminates the need to migrate datasets into specialized systems or proprietary formats for analytics purposes.

To accompany the software, Cloudera is offering Enterprise RTQ – an optional subscription module that adds technical support and management automation to Impala, designed for enterprise customers. According to the company, it is the first data management solution that moves Apache Hadoop decisively “beyond batch,” enabling users to handle real-time workloads that previously required investment in expensive data warehouse solutions.

“We believe that for Hadoop to cross over to the enterprise, it must become a first class citizen with IT, the business and the data centre,” commented Tony Baer, principal analyst of Software and Enterprise Solutions at Ovum.

“A large part of making Hadoop a first-class citizen in the enterprise is making it accessible to the large base of SQL developers and applications that already exist. With Impala, Cloudera has decisively planted the stake in bringing the worlds of Hadoop and enterprise SQL together. And it has done so in a way that addresses the expectations for performance that are taken for granted in the enterprise SQL world.”

You can download Impala 1.0 and Cloudera’s Hadoop distribution for free on the company’s website.

Meanwhile, Cloudera’s competitor MapR has just released M7, its flagship NoSQL Hadoop platform. It offers simplified, automated administration for HBase, increased performance and lower latency, and removes bottlenecks related to the JVM (Java Virtual Machine).

Originally published on eWeek.

Max Smolaks

Max 'Beast from the East' Smolaks covers open source, public sector, startups and technology of the future at TechWeekEurope. If you find him looking lost on the streets of London, feed him coffee and sugar.
