Actian distributes its database into Hadoop for faster Big Data work that can be carried out using widely available SQL skills
Actian has announced what it calls the first full integration of big data platform Hadoop with a traditional SQL database, promising new “end to end” analytics that run natively in Big Data’s favourite platform while also providing “industrial grade” SQL access and query tools.
The announcement may draw criticism from other Big Data and analytics vendors, all of whom have promised integration between Hadoop and SQL – including Cloudera’s announcement of Impala – but Actian says its Analytics Platform SQL Edition brings the two together at a deeper and more productive level, “empowering millions of business-savvy SQL users and business analysts to conduct advanced analytics directly on data in the Hadoop Distributed File System (HDFS).”
Ready for Actian?
Actian, originally known as Ingres, has a vector database designed for Big Data, but it has kept some critical distance from Hadoop, labelling it as esoteric with a demand for highly-skilled knowledge workers that means it is only ready for experimentation.
“While Hadoop is a tremendous step forward as pioneers drive business value from their big data assets, optimal adoption is limited to the privileged few with deep pockets and rarefied skills,” said Actian CTO Mike Hoskins back in February.
Despite this, Hadoop has shown a lot of strength in handling large amounts of data by processing it on distributed hardware, leading Actian vice president Ketan Karia to tell TechWeekEurope: “SQL has all the skills but no data, while Hadoop has all the data, but no ability to get value out.”
The Hadoop community has acknowledged this, and addressed it in various ways, but Emma McGrattan, SVP of engineering at Actian, was critical of all other approaches.
She said big traditional firms such as Teradata and Microsoft have used connectors to export data from Hadoop to their SQL databases, losing the ability to do real-time analysis on full datasets, while other approaches have wrapped legacy software up to make it fit into the Hadoop world.
Meanwhile, smaller firms such as Splice and CitusData are implementing an open source relational database such as PostgreSQL or Apache Derby in their systems, where they handle data converted to SQL: “They have processing on Hadoop, but the data is not HDFS data,” said McGrattan.
Firms like Cloudera, which are aiming for full integration, haven’t made it yet, she said, describing their approach as “integrated but immature”.
Actian says it has placed its X100 database engine onto every node, so that Hadoop data can be queried using SQL, producing results which comply with the ACID requirements for transaction processing, and with established analytics tools also available.
The result is fast, says Actian, claiming it runs 30 times faster than Cloudera’s Impala on similar hardware.