Researchers Identify Mobile Users Through Anonymous Location Data

Max 'Beast from the East' Smolaks covers open source, public sector, startups and technology of the future at TechWeekEurope. If you find him looking lost on the streets of London, feed him coffee and sugar.

Anonymised location data? There’s no such thing

New research suggests that using a complex mathematical model, it is possible to identify smartphone users based on just four pieces of anonymised location data.

The study, published in Scientific Reports, claims that after analysing hourly movement patterns, researchers from Massachusetts Institute of Technology (MIT) and the Catholic University of Louvain were able to identify individuals with 95 percent accuracy using four data points on the map.

Nothing is secret

According to the team at MIT, a third of the applications available on Apple’s App Store access a user’s geographic location. Skyhook resolves 400 million users’ Wi-Fi locations every day, and the geo-location of around half of all iOS and Android traffic is available to ad networks.

In most cases, this information is collected by third parties to improve services and target advertising. However, it turns out it can also be used to reveal the identity of the owner of a mobile device.

To see how reliable this identification can be, a team of four researchers studied fifteen months of human mobility data for 1.5 million smartphone users. It found that “human mobility traces are highly unique”, and can act as a sort of fingerprint. But while a fingerprint is identified based on a minimum of 12 points, you need just four to successfully link the movements and identity of a smartphone user.

“In a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier’s antennas, four spatio-temporal points are enough to uniquely identify 95 percent of the individuals,” wrote the researchers in an article entitled “Unique in the Crowd: The privacy bounds of human mobility”.
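To give a feel for what “uniquely identify” means here, the following Python sketch estimates the share of users whose trace is the only one that contains a handful of points drawn from it. It is an illustration only, not the researchers' code or data: the function name `unicity`, the encoding of each point as an (antenna, hour) pair and the three toy traces are all assumptions made for this example.

```python
import random


def unicity(traces, p, rng):
    """Estimate the fraction of users pinned down by p points of their own trace.

    traces: dict mapping a user id to a set of (antenna_id, hour) points.
    p:      number of points an observer is assumed to know about a user.
    rng:    a random.Random instance, passed in so runs are repeatable.
    """
    unique = 0
    eligible = 0
    for points in traces.values():
        if len(points) < p:
            continue  # not enough points to draw a sample from this user
        eligible += 1
        # Draw p known points from this user's trace.
        sample = set(rng.sample(sorted(points), p))
        # Count every trace (including the user's own) that contains them all.
        matches = sum(1 for other in traces.values() if sample <= other)
        if matches == 1:
            unique += 1
    return unique / eligible if eligible else 0.0


# Hypothetical toy data: users "a" and "b" share two of their three points.
toy_traces = {
    "a": {(1, 8), (2, 9), (3, 18)},
    "b": {(1, 8), (2, 9), (4, 18)},
    "c": {(5, 8), (6, 9), (7, 18)},
}

if __name__ == "__main__":
    rng = random.Random(0)
    for p in (1, 2, 3):
        print(p, unicity(toy_traces, p, rng))
```

In this toy dataset, knowing all three points always singles out one user, while a single point often fails to tell “a” and “b” apart. The study reports the same kind of measurement on real carrier data, where four points sufficed for 95 percent of individuals.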

More worrying still, even an approximate location can serve the same purpose, since the “uniqueness” of traces does not fall in proportion as the resolution of the data is coarsened. The team, led by Yves-Alexandre de Montjoye, developed a mathematical model that can be applied to location datasets, in the hope that it will help pinpoint the balance between the quality of the mobility data and the users' need for privacy.

This is not the first experiment of this kind. The BBC reports that in 2006, AOL released anonymised information on 20 million searches, which was deemed to pose no risk to users' privacy, but the New York Times was able to put the pieces together and establish the identity of “searcher 4417749”.

“We think this data is more available than people think. When you think about, for instance, Wi-Fi or any application you start on your phone, we call up the same kind of mobility data,” de Montjoye told the BBC.

“When you share information, you look around you and feel like there are lots of people around – in the shopping centre or a tourist place – so you feel this isn’t sensitive information,” he added.

The team hopes its research will influence the debate about the safety of Big Data and privacy in the digital age.

Meanwhile, researchers from the University of Cambridge have claimed that sexual orientation, race, age, relationship status and even a history of substance abuse can be accurately deduced from seemingly unrelated things a user “likes” on Facebook.
