Oracle’s big data lead explains the difference between ‘good hard’ and ‘bad hard’ as well as the future ethics of AI and machine learning
The combination of cloud, Internet of Things (IoT) and big data analytics is seen as a pillar of digital transformation, but the final part of that equation is still proving difficult, at least in Oracle’s eyes.
Ian Sharp, lead data scientist at Oracle, told Big Data World in London that even before many organisations are ready to process the enormous amount of data generated by the IoT, they are struggling to make sense of existing sources.
“Big data is proving harder for us than previously thought,” he explained, differentiating between ‘bad hard’ and ‘good hard’.
Good v Bad Big Data
‘Bad hard’, he said, was when organisations were finding it difficult to gain insights – either because of cultural or technological issues – and were scrapping projects before they bore fruit because of expense.
The assumption that big data projects yield immediate results makes people who sign off the cheques impatient.
‘Good hard’, Sharp said, was when you were able to process the data but making the best use of it was the challenge.
“The challenge is getting the data into the system and getting it to work that is so great,” he concluded. “[Organisations are] hamstrung by trying to get it to work before it can.”
This is where Sharp said Oracle could come in, helping deliver the platforms that allow data scientists to gain insights. He said he wasn’t ‘anti do-it-yourself’ but believed that by taking away some of the ‘bad hard’, companies could focus on ‘good hard’.
Sharp cited Oracle’s work with CERN, which processes 15GB worth of log files a day, and Perform’s Opta, which provides football statistics to broadcasters and governing bodies, as examples of customers who are now at the ‘good hard’ stage because they have different challenges.
Next for analytics
Sharp said he has worked in the fields of machine learning, data science and business intelligence for years – long before the term Big Data was coined – and that the next step was data ethics as well as the ability to recognise emotion in audio and video. He even suggested we could one day tell if someone is being boring.
“The thing we all talk about is the rise of cognitive services, deep learning and potential use of robots,” he elaborated.
“My personal view is that audio tech is much further ahead than visual, at least in the mainstream. We can identify the speaker fairly easily [using frequencies]. Video stuff is a bit further behind.”
Sharp also noted the limitations of some machine learning systems. He used Microsoft’s controversial ‘celebslikeme’ as an example, claiming that submitting a picture of Leonardo DiCaprio did not return an image of the actor himself but someone else.
“[Microsoft CEO] Satya Nadella talks about democratising AI and [Oracle founder] Larry Ellison says every technology since fire has been used for good and bad,” he said. “It’s about finding the right, ethical user cases.”