Labhesh Patel, chief technology officer and chief scientist at Jumio, talks about AI, the challenges of data protection and how he learned to go offline for 25 minutes at a stretch
What is your role and who do you work for?
I am the CTO and chief scientist at Jumio. We are a biometric identity verification provider that uses computer vision technology, machine learning, augmented intelligence and live verification experts to verify government-issued IDs (e.g. passports, driver’s licenses, etc.) from over 220 countries and territories.
As CTO and chief scientist, part of my job is to make sure that our major technology initiatives are successful. One of our current, large-scale initiatives is our cloud migration, where we’re moving all of our applications from on-premises hosting to a microservices-based cloud deployment. In addition, we’re constantly trying to improve our products with new functionality, which my team is responsible for.
As chief scientist, one of my big focus areas is to figure out how to use Jumio’s massive data assets to bring about more automation. We have conducted over 150 million transactions to date, and as part of that, we’re looking at the data and learnings from those transactions to use machine learning, augmented intelligence and AI to bring about automation and improve processes. These technologies are helping us improve our verification accuracy, fraud detection and verification speed.
How long have you been in IT?
This is my 18th year in IT officially, but actually 20 if you count the time before I earned my master’s degree at Stanford University. I’ve worked at a mix of larger organisations, like Cisco, as well as a number of startups, including one that Apple acquired.
What is your most interesting project to date?
We are working on a range of extremely interesting projects at Jumio right now. One of these involves matching a person’s face to the face on his or her ID document, which relates to the identity verification element of our business. It’s quite a difficult task because users may look quite a bit younger in the photo on their ID document than they do at the time we ask them to capture a real-time selfie, so it can be difficult to match the two images.
Over the last few months we have been looking for ways to use deep learning, augmented intelligence and AI to enhance this process, and the result is our Face Match feature. We trained an AI model to look at two faces and tell us whether or not they belong to the same person. We showed the model about 4 million images, and it is now able to match faces with astounding accuracy.
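Jumio’s production pipeline isn’t public, but face matching of this kind is commonly implemented by mapping each face to an embedding vector with a trained network and then comparing the two vectors. A minimal sketch of that comparison step (the function names and the 0.8 threshold are illustrative assumptions, not Jumio’s actual system):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def same_person(selfie_emb, id_emb, threshold=0.8):
    """Decide whether a selfie and an ID photo show the same person.

    The embeddings would come from a face-recognition network trained
    so that photos of one person cluster together even across an age
    gap. The threshold is a placeholder; in practice it is tuned on a
    validation set to balance false accepts and false rejects.
    """
    return cosine_similarity(selfie_emb, id_emb) >= threshold
```

The age-gap problem mentioned above is handled at training time: the network learns to produce similar embeddings for the same identity at different ages, so the simple threshold comparison still works.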
AI gets better as you provide the model with more data. Luckily, we have conducted tens of millions of facial recognitions, so we have tens of millions of selfies and matching IDs to work with. This gives us the opportunity to improve the model with even more data, and right in front of our eyes we are seeing the system do something that humans physically cannot.
What is your biggest challenge at the moment?
As we all know, the GDPR went live in May of this year, and this has completely changed the machine learning process, affecting the exchange of data between data controllers (our business clients) and data processors (such as Netverify).
Historically, online identity verification providers would pool similar customers and use cases into a single, large data pool to develop and train their machine learning algorithms. There is a direct connection between the amount of data and the quality of the algorithm. While GDPR does not explicitly forbid aggregating customer data for machine learning purposes, it’s our perspective that the spirit of the law requires vendors to segregate the data from different customers in order to develop customer-specific machine learning algorithms.
As a data processor, this means that we won’t ‘co-mingle’ the ID and identity information from different clients into a larger pool to develop more accurate machine learning algorithms. So, if one of our clients (e.g. a bank) sends us a feed of data, and another client (e.g. another bank) sends us another data feed, we will not mix these data feeds and create a model that benefits from both sets of data. Instead, we’re leveraging Jumio AI Labs to make sure that we train our models within each customer silo. As you can imagine, the infrastructure required to do this segregation and create highly predictive models is extremely complex.
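The per-customer silo approach described here can be sketched in miniature. This is an illustrative mock, not Jumio’s infrastructure: the class and method names are hypothetical, and the ‘model’ is a trivial per-client average standing in for a real training run. The point it demonstrates is that training for one client only ever touches that client’s own feed:

```python
from collections import defaultdict

class SiloedTrainer:
    """Keep each client's training data and model strictly separate,
    so no client's records ever influence another client's model."""

    def __init__(self):
        self._data = defaultdict(list)   # client_id -> that client's feed only
        self._models = {}                # client_id -> that client's model

    def ingest(self, client_id, record):
        """Append a record to a single client's silo."""
        self._data[client_id].append(record)

    def train(self, client_id):
        """Train using only this client's silo (here, a trivial mean)."""
        rows = self._data[client_id]
        model = sum(rows) / len(rows)
        self._models[client_id] = model
        return model
```

Two banks sending separate feeds end up with two independently trained models; pooling their rows into one `train` call simply never happens.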
What technology were you working with ten years ago?
Even ten years ago I was working in AI but we were using a lot of classical techniques for machine learning, including models such as logistic and linear regression. Most machine learning tasks in those days required a lot of feature engineering, as we did not have access to the architectures and the computing power that power the deep models of today.
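To illustrate the contrast: in the classical approach, a human engineer hand-designs the features and a simple model such as logistic regression weighs them. A toy sketch of that workflow (the feature names, weights and document fields are invented for illustration):

```python
import math

def engineer_features(document):
    """Hand-crafted features: in classical ML, a human decides
    exactly what the model should look at (illustrative examples)."""
    return [
        1.0 if document["font_mismatch"] else 0.0,  # binary flag
        document["photo_edge_noise"],               # engineered image statistic
        document["age_gap_years"] / 50.0,           # scaled numeric feature
    ]

def logistic_score(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

A deep model, by contrast, would take the raw image pixels and learn its own internal features, removing the `engineer_features` step entirely.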
What is your favourite technology of all time?
TensorFlow, a deep learning framework that was open sourced by Google a couple of years ago, is definitely one of my favourite technologies. It has dramatically changed the deep learning landscape, allowing data scientists to be extremely efficient. It’s an amazing technology for end-to-end machine learning — not only is it great from an experimentation perspective, but it has amazing modules for putting your models into production.
How will the Internet of Things affect your organisation?
Online identity verification is at the core of what we do, and the need to establish trust online from different endpoints increases as the number of IoT devices that people can transact business on continues to grow. Take Amazon’s Alexa, for example: if a user wants to buy a high-ticket item via that device, we need to verify the identity of the person conducting the transaction. As more people rely on smart devices that connect to payment apps, the number of endpoints on which Jumio needs to verify identity also grows.
With IoT, there are also more and more sensors around, which means we can get better location data for people who are transacting business. The better the location data is from these GPS signals, the better our risk scoring works. For example, if we see an ID coming in from Palo Alto, and we’re getting the same location from other IoT devices, we know that the person is in Palo Alto. That risk profile is very different if an ID is from Palo Alto but the location data from an IoT device comes from a totally different location.
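The location-consistency check described here boils down to comparing two GPS fixes. An illustrative sketch (the 50 km agreement radius and the risk labels are assumptions for the example, not Jumio’s scoring system):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def location_risk(id_fix, device_fix, agree_km=50.0):
    """Lower risk when the ID capture location and another IoT
    device's location agree; elevated risk when they diverge."""
    dist = haversine_km(*id_fix, *device_fix)
    return "low" if dist <= agree_km else "elevated"
```

An ID captured in Palo Alto with a device fix from nearby Mountain View would score "low"; the same ID with a device fix from New York would score "elevated".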
What smartphone do you use?
I use an iPhone X and love it.
What three apps could you not live without?
I use Gmail and Google Maps a lot. I also use Pomodoro, a timer that allows you to work in 25-minute cycles so that you ultimately work more productively. It means that I focus on one thing solely for that cycle without getting distracted by things like email and IMs. It would be hard for me to go offline for several hours at a stretch, but the 25-minute cycles are much more manageable.
What new technology are you most excited for a) your business and b) yourself?
Deep learning is what I’m most excited for, both on a personal level and for Jumio. While traditional machine learning is all about showing the machine data and telling it what to focus on to train it, with deep learning, you just have to show the machine the data and it can learn what to focus on.
This makes it extremely powerful, because you can then focus solely on the network architecture, making the process extremely fast. New network architectures keep appearing in academic papers; you can start from one of those, fine-tune it slightly with your own data, and the result will be extremely powerful.
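That fine-tuning workflow can be sketched in miniature: a published network’s layers are frozen as a fixed feature extractor, and only a small task-specific ‘head’ is retrained on your own data. A toy version (the ‘backbone’ here is a stand-in function, not a real pretrained network):

```python
def pretrained_backbone(x):
    """Stand-in for a published network's frozen layers: maps raw
    input to features and is never modified during fine-tuning."""
    return [x, x * x]

def train_head(examples, lr=0.05, epochs=300):
    """Fit only the small task-specific head (a linear layer on top
    of the frozen backbone) by stochastic gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            feats = pretrained_backbone(x)
            pred = b + sum(wi * fi for wi, fi in zip(w, feats))
            err = pred - y
            # Gradient step updates only the head's parameters
            w = [wi - lr * err * fi for wi, fi in zip(w, feats)]
            b -= lr * err
    return w, b
```

Because the expensive representation learning is already done inside the backbone, only a handful of head parameters need training, which is why starting from a published architecture is so fast.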
Deep learning has enabled an influx of ideas from academic circles and the likes of Google and Facebook, allowing new companies to use their own data sets in conjunction with these new models to create really powerful results. This has never really been possible before.
From a business perspective, we’ve seen some great models as a result of deep learning, like Jumio’s Face Match. And on a personal level, I am always on the lookout for fresh techniques.
If you weren’t doing the job you do now, what would you be doing?
If I were not working at Jumio I would be reading up on the latest developments in deep learning and working on them myself — this is something I currently only have time to do in the evening and on weekends. I start each day meditating and exercising, so between that and spending time with my 5-year-old, there isn’t time for much else!