Duncan Ross, director of data science at Teradata UK, explains how to find people with the right skills for your IT team
As anyone working in big data knows, if there is one thing that is expanding faster than data volumes it is the skills deficit – the gap between the demand for data scientists from industry, and the number available.
Data science is a demanding profession that relies on people with genuine intelligence, creativity, inquisitiveness, analytical ability and commercial savvy. But they are in short supply. In this article I will be looking at some of the most effective approaches organisations can take to bridge that skills gap.
Where is the skills gap?
Researchers at Gartner predict that 85 percent of Fortune 500 enterprises will be unable to effectively exploit data this year (2015). Their counterparts at TDWI Research, meanwhile, found that 46 percent of businesses admitted to inadequate staffing or skill levels for big data analytics.
In its own research, Teradata found similarly worrying results. Almost three quarters of UK firms in the company’s OnePoll survey believed graduates lacked the necessary expertise for effective data analysis, while 60 percent of businesses running a big data project found it difficult to obtain the right mix of skills.
The problem is not in finding people who are technically capable, since many graduates are coming out of universities with coding skills. What is difficult is finding people who can approach the coding from an analytical perspective – who will ensure that results are not only reliable but also relevant to the commercial aims they have been set.
In part it is a question of outlook. Great analysts are the sort of people who are very irritating at parties, because they always want to get behind the information they are presented with. But while they may be sceptics at one level, they are also full of enthusiasm about what can be done.
Yet if we are to overcome the skills gap we will need more than just enthusiasm. Universities are also producing graduates with good analytical skills, many of whom show that vital enthusiasm. But for them to be effective we also need to ensure we develop analysts with commercial awareness, because that is the context in which the most important analytics are conducted. It is quite possible for an analysis to be mathematically perfect and yet completely useless from a business perspective. Technical skills alone are not sufficient – it is all very well being able to build a random forest (an ensemble of decision trees – an analytical algorithm that has proved to be very accurate), but can the data analyst say why it has any importance to the business?
In the UK, an investigation by Nesta, an innovation charity founded with government funding, discovered “a severe shortage of UK data talent with the right skills”, with the shortage growing more acute the further a business is from London.
Growing existing talent
One potential answer to this shortage of skills would be to develop talent in-house, which is entirely feasible, given the right supervision. If you give someone who is a good analyst the right access to a text analytics or graph analytics toolset, then with appropriate support they can become good at text or graph analytics. But much of the challenge is in finding the right people in the first place.
Unfortunately this in-house skill development is quite rare in the UK – it requires an organisation to have someone who already has the necessary expertise, plus the willingness and ability to pass that expertise on. This has led to some companies adopting a hybrid model, with some fully-skilled data scientists recruited from outside and others developed in-house, in order to be able to up-skill rapidly.
Another factor that has enabled more rapid growth of data science capability is the development of open source analytical and data packages – with the result that almost every technique is going to be available to the developing data scientist, because someone somewhere will have written the code for it and made it easily downloadable. There are many great open source analytical packages, but three that deserve mention are D3.js (a JavaScript library for building visualisations), R (a language for statistical programming), and Python (a programming language with many statistical libraries).
Alongside this has come the growth in MOOCs (Massive Open Online Courses), allowing people to learn relevant coding skills quickly and cheaply.
An additional benefit from developing analytics abilities internally is that as staff develop their skill-set in a commercial context they are more immediately aware of the business objectives of the organisation for which they are working.
Teams, not unicorns
Team-building is another important tactic in tackling the skills gap. There is little point looking for the great, single all-rounder who can do everything – the mythical unicorn. Even if such people existed (and they may) they would be too expensive as they can walk into any job. It is much more profitable to look across the skill-set required and build a team to fulfil it.
Traditionally, analysts have worked alone, as individual contributors, either because of the way that organisations were set up, or because the software they used did not facilitate group working. Times have changed and it is generally agreed now that teams are more effective and more creative, dividing the work and balancing skill-sets and experience. It is also vitally important now that data scientists are able to transfer techniques from one field and apply them to another area of business. Team-working certainly facilitates this.
The role of education
Looking at the bigger picture, there are reasons to believe that the education system is starting to produce more data science-focused graduates and post-graduates – people who have more than just raw coding or analytical skills.
Of course, the ability to code is important. In fact there is an argument that it is becoming even more important in analytics. Before the emergence of open source software, analytics was in many ways gradually becoming more democratised, as interfaces increasingly focused on business rather than technical users, and it was possible to create a predictive model by pressing a few buttons. The arrival of open source tools, which tend to have weaker user interfaces and a focus on a language-based approach, has placed fresh emphasis on strong coding skills. These coding skills are one area where universities have demonstrated the ability to teach effectively.
Nevertheless, the signs are encouraging because UK universities are now starting to address the whole question of skill provision, with an understanding that coding is an important part, but only a part, of the picture.
Institutions that are proving effective include (but aren’t limited to): the University of Dundee, University College London and the University of Brighton. Crucially, here, faculties are working together so that computer sciences are blended with business and mathematics, including problem-solving.
Of course, for this to work there needs to be improvement at the other end of the education picture – we need the school mathematics curriculum to provide additional emphasis on real-world data handling and statistical analysis.
The UK is facing a similar skills gap to the rest of the world, but as organisations including universities and data-related businesses take the initiative, we are beginning to tackle it – hopefully before it causes serious damage.