Categories: Big Data, Data Storage

IBM Invests Heavily In ‘Important’ Open Source Apache Spark Big Data Project

IBM has pledged significant resources and up to 3,500 researchers to the Apache Spark big data platform, which the company calls “the most important new open source project in a decade that is being defined by data.”

Spark was originally developed in 2009 at the AMPLab at UC Berkeley, of which IBM is a founding partner, and has gained popularity because of its perceived ease of use and efficient memory management.

Its supporters claim Spark is up to 100 times faster than Hadoop’s MapReduce when analysing data in memory, and ten times faster on disk. Spark had 465 contributors as of 2014, making it the most active project in the Apache Software Foundation and the most active open source Big Data project.

IBM Spark

IBM says this commitment from the open source community means Spark is in a constant state of improvement, and the company wants to aid the development of the platform with its own contributions.

“IBM has been a decades long leader in open source innovation. We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way,” said Beth Smith, general manager of IBM’s analytics platform. “Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation.”

Spark is to be built into IBM’s analytics and commerce platforms, and the company will offer Spark as a cloud service through Bluemix. Watson Health Cloud will use the engine to help medical researchers analyse population health data, and IBM’s own SystemML machine learning technology is to be open sourced to aid Spark’s development.

Up to 3,500 researchers and developers will work on Spark-related projects across the world and IBM has committed to educating more than one million data scientists and data engineers about the platform.

High-profile users of Spark include NASA and the SETI Institute, which are analysing terabytes of complex deep space radio signals to see if there is evidence of extra-terrestrial life.


Steve McCaskill

Steve McCaskill is editor of TechWeekEurope and ChannelBiz. He joined as a reporter in 2011 and covers all areas of IT, with a particular interest in telecommunications, mobile and networking, along with sports technology.
