Data analytics has never been more important to modern society. In the tech community, that statement is so readily accepted that it’s practically a truism.
More importantly, the wider world is now more aware than ever of the transformative, hugely consequential adoption of data analytics across wide swathes of industry. Indeed, in 2017 The Economist famously described data as the ‘new oil’.
Just as the mainstream is really beginning to understand it, data analytics is transitioning into something new.
At the turn of the century, data analytics was very much ‘batch based’: data was pulled from a data lake, a large, slow repository for all the data you might want later on.
But modern society runs in real time, and in our ‘always on’ world – driven by the internet and social media – batch-based analytics simply doesn’t work.
For example, say you go online to do some shopping. You find your product, and on the site you are told there are six items left in the warehouse. You go to make the purchase, and are then told it will take two weeks because it’s not in stock. Are you happy with that situation, or will you go elsewhere for your shopping?
Accurate, real-time analytics is absolutely vital to customer satisfaction in the world we live in now. Data sets are getting bigger and bigger, and more and more people are realising that data can be monetised.
Data should not be seen as a hindrance to your organisation, or as something to get rid of. You keep data, you accumulate it, and the more data you have, the more business value you can drive.
From science fiction to reality
If you observe the latest applications of data analytics, and especially when it comes to implementations of machine learning and AI, you’ll notice some big advances being driven from several angles.
With regards to machine learning, three factors are pushing it forward: better algorithms, much larger data sets, and access to massive amounts of compute resources.
Put those three things together and you have huge levels of adoption – and the reason we are seeing AI moving from science fiction to reality. AI is underpinning so much of what we do today, whether it’s in healthcare, manufacturing, or finance.
Of course, it’s the hardware that makes all of this possible. To take one case study, Nvidia moved from producing graphics cards for computer games to producing GPUs that underpin pretty much all of the world of AI today. Where a CPU has five, ten, or 20 cores, an Nvidia GPU has thousands. That means massive parallelism, which supports advanced AI and analytics workloads.
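The idea behind that parallelism can be sketched in a few lines of plain Python (a toy illustration, not GPU code; the function names are assumptions for the example): when the per-element work is independent, it can be spread across many workers at once, which is what a GPU does across thousands of cores.

```python
# Toy illustration of data parallelism: the same independent operation
# applied to many elements at once. A real GPU workload would use a
# framework such as CUDA; this sketch uses a thread pool to show the idea.
from concurrent.futures import ThreadPoolExecutor

def transform(x: float) -> float:
    # The per-element work; each call is independent of the others.
    return x * x

def parallel_transform(values):
    # Because the elements are independent, the map can be split
    # across workers with no coordination between them.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(transform, values))

results = parallel_transform([1.0, 2.0, 3.0, 4.0])
```

The result is identical to a sequential loop; the point is that nothing in the computation forces the elements to be processed one after another.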
Whether it’s storage, networking, or compute, organisations put new hardware in place to deliver on analytics and AI workloads, and once they have, they want to get maximum value from it.
Keeping those environments busy is key, and that can only be done by feeding them data as quickly as possible. Hosting that data on flash-based, high-bandwidth, massively parallel systems is the best way to do so.
As quick as a flash
Flash brings a lot to the table, especially in the world of analytics, big data and machine learning.
Firstly, flash storage is a very fast data platform by nature. You can write data to it fast, access it fast, and get insights back fast.
It’s also more reliable than spinning-disk media, and its density means that instead of taking up literally racks and racks in a data centre to host a large data set, you can consolidate it into ‘rack units’ – fractions of a rack.
Instead of storage arrays that look like multiple fridges in a data centre, storage arrays based on flash technology look more like a standard server. They take up a fraction of the space, and of course that means a reduced carbon footprint.
With flash technology you can also aggregate lots of different workloads onto a single flash blade platform, whereas historically you would have data silos. Each of those silos was there for a good reason – its infrastructure had characteristics that made it suitable for a certain use case, such as a data warehouse, real-time analytics, AI, or a relational database.
As a platform, flash blade can meet all of those use cases – we call it a data hub. That means you don’t need lots of disparate types of technology, which also keeps costs down. This is what makes flash a compelling area for businesses looking to leverage analytics and machine learning.
The AI data pipeline
Speed is hugely important when it comes to AI. Here as well, flash provides a massive speed advantage all the way through what we call the AI data pipeline.
The pipeline roughly looks like this: it starts with ingesting data; then cleaning, transforming, and labelling that data; then exploring the data to understand which algorithms can make best use of it; and finally training the models.
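The stages above can be sketched as plain functions (a minimal sketch; the stage names and toy logic are assumptions for illustration, not any particular product’s API):

```python
# Minimal sketch of the AI data pipeline: ingest -> clean/transform/label
# -> train. All data and logic here are illustrative stand-ins.

def ingest():
    # Pull raw records from a source (here, hard-coded messy readings).
    return [" 3.0", "4.5", None, "bad", "6.0 "]

def clean(raw):
    # Drop missing or unparseable values and normalise the type.
    out = []
    for record in raw:
        try:
            out.append(float(record))
        except (TypeError, ValueError):
            continue
    return out

def label(values, threshold=5.0):
    # Attach a simple label to each cleaned value.
    return [(v, "high" if v >= threshold else "low") for v in values]

def train(examples):
    # Stand-in for model training: summarise the "high" class.
    highs = [v for v, lab in examples if lab == "high"]
    return sum(highs) / len(highs) if highs else None

model = train(label(clean(ingest())))
```

The faster data can flow through each of these stages, the sooner a trained model is available; that is the bottleneck the storage layer has to feed.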
Being able to progress through that pipeline as quickly as possible provides business value. Whether the mission is to deliver autonomous driving capabilities, computer models to aid diagnosis in healthcare, or innovations in financial services, efficiency and faster time to insight are the key factors.
That’s the advantage flash has over the old world of spinning disk, which cannot keep up with modern performance demands.
There is no putting the cork back in the AI bottle
In terms of the advancement of analytics and machine learning, we’re on a journey now which is almost unstoppable. The world of AI has gone through several waves of progress, all the way back to the sixties and seventies. And now it feels like it is here to stay.
There is no putting the cork back in the bottle for AI. People continue to accumulate more and more data, and to be able to make sense of that data you need really advanced analytics or machine learning methodologies.
There’s also the competitive nature of industry to take into account. Businesses know that their competitors are going to be investing in accumulating data sets and getting insight from that data, and if they don’t keep up they will no longer be competitive. That competitive tension is driving this area forward at an incredible rate.
These data sets aren’t shrinking any time soon. Because of the value of data, businesses are accumulating more and more of it – they don’t want to delete what might be useful in five or ten years’ time. And that means the data sets are growing.
The future is incredibly bright for AI, machine learning and analytics. Flash has firmly established its place in the data centres that facilitate those areas, and will continue to displace spinning disk technology.
The new world of data presents opportunity – and challenge. Data shared when and where it is needed is the engine of value. Convert data to value with a data-centric architecture spanning people, places, and clouds.