
DefCon 2017: ‘Anonymous’ Browsing Data Easy To De-Anonymise

A pair of German researchers have shown just how easy it is to identify individuals and track their internet browsing habits in detail using supposedly ‘anonymised’ data sources.

The research, presented at DefCon in Las Vegas by journalist Svea Eckert and data scientist Andreas Dewes, sheds light on the practices of companies that collect user data, either for their own purposes or to sell on to third parties.

Defeating anonymisation

The data is, in theory, stripped of identifying information before being used, but Eckert and Dewes found that in some cases as few as 10 web addresses are enough to identify who the ‘clickstream’ belongs to.

Once the individual has been identified, often by matching particular information in the clickstream to data that’s publicly available, the stream indicates everything that user has been doing online, minute by minute, Eckert and Dewes said.

For instance, their experiment uncovered the porn habits of a judge and the drug preferences of a German MP.

After matching one clickstream to a particular police detective, they found details of an ongoing police investigation by examining Google Translate URLs, which store the full text of any query.

In many cases identity could be established by examining particular URLs. For instance, if someone logs into their own analytics page on Twitter, an address is generated that contains their username.
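As a rough illustration of how a self-identifying URL can give a user away, the sketch below scans a clickstream for addresses that embed a username. The URL pattern, the `candidate_usernames` helper and the sample data are assumptions made for this example; the researchers did not publish their code, and the article only says that the address contains the username.

```python
import re
from collections import Counter

# Assumed URL shape, for illustration only.
ANALYTICS_URL = re.compile(
    r"^https?://analytics\.twitter\.com/user/([A-Za-z0-9_]{1,15})(?:/|$)"
)

def candidate_usernames(urls):
    """Count usernames that appear inside analytics-style URLs."""
    counts = Counter()
    for url in urls:
        match = ANALYTICS_URL.match(url)
        if match:
            counts[match.group(1).lower()] += 1
    return counts

# Hypothetical clickstream of one 'anonymous' user.
clickstream = [
    "https://example.org/news/article-1",
    "https://analytics.twitter.com/user/example_user/tweets",
    "https://example.org/shop/cart",
    "https://analytics.twitter.com/user/example_user/home",
]

print(candidate_usernames(clickstream).most_common(1))
# prints [('example_user', 2)]
```

A single hit of this kind is enough to attach a name to every other URL in the same stream.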

In other cases the clickstream might show that the user visited a particular page, say a YouTube video, at a particular time, while the individual has mentioned watching that video on a publicly visible source such as a blog.
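This kind of linkage can be sketched as a simple set intersection: each URL a person is publicly known to have visited narrows the pool of clickstreams that could belong to them. The snippet below is a minimal sketch under stated assumptions, not the researchers' method; the stream IDs, URLs and the `matching_streams` helper are invented for illustration, and timestamps are ignored for simplicity.

```python
def matching_streams(streams, public_urls):
    """Return IDs of streams that contain every publicly known URL."""
    wanted = set(public_urls)
    return sorted(sid for sid, visited in streams.items() if wanted <= visited)

# Hypothetical 'anonymised' clickstreams: pseudonymous ID -> visited URLs.
streams = {
    "user_a": {"youtube.example/watch?v=1", "news.example/a", "shop.example/x"},
    "user_b": {"youtube.example/watch?v=1", "news.example/b"},
    "user_c": {"youtube.example/watch?v=1", "news.example/a", "blog.example/p"},
}

# URLs the target has mentioned publicly, for example on a blog.
public = ["youtube.example/watch?v=1", "news.example/a", "blog.example/p"]

# Show how the candidate pool shrinks as more public URLs are used.
for n in range(1, len(public) + 1):
    print(n, matching_streams(streams, public[:n]))
# prints:
# 1 ['user_a', 'user_b', 'user_c']
# 2 ['user_a', 'user_c']
# 3 ['user_c']
```

With millions of streams the principle is the same, which is consistent with the finding that a handful of addresses can be enough to single out one person.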

Nosy browser extensions

“The increase in publicly available information on many people makes de-anonymisation via linkage attacks easier than ever before,” the researchers said in the presentation.

The data was surprisingly easy to obtain, with 95 percent of it coming from 10 popular browser extensions. Such extensions offer users a service, but also monitor everything they do online. The companies behind them either use the data themselves, for purposes such as targeting advertisements, or sell it to third parties.

The researchers estimated that up to 10,000 extensions collect detailed user data, but most have a relatively small user base.

“When thinking about surveillance, everyone worries about government agencies like the NSA and big corporations like Google and Facebook,” Eckert wrote in a blog post. “But actually there are hundreds of companies that have also discovered data collection as a revenue source… Most of them keep their data to themselves, some exchange it, but a few sell it to anyone who’s willing to pay.”

Social engineering

The researchers posed as a marketing company that wanted to buy browsing information to train its machine learning tools, and it took them about two weeks to obtain one month’s worth of browsing information on three million German users, compiling a database of 3 billion URLs spread across 9 million sites.

The information was so sensitive they deleted it after the investigation for fear it might fall into the hands of hackers. In her blog post Eckert said the way browser extension companies collect and resell user data is “often illegal” under European law.

A data broker provided Eckert and Dewes, free of charge, with information obtained from browser plugins including Web Of Trust (WOT), which, ironically, provides reviews of websites’ privacy practices.

After German public broadcasting network NDR published a report based on the study last November, WOT reworked its extensions and mobile app to better protect users’ anonymity, also giving users the ability to opt out of data collection.

But Eckert and Dewes said it’s next to impossible to make a clickstream fully anonymous.

“High-dimensional, user-related data is really hard to robustly anonymise, even if you really try to do so,” they said in the DefCon presentation.

Users who want to anonymise their own clickstream can use services such as Tor or a VPN with rotating exit nodes, or client-side software that blocks trackers, they said.


Matthew Broersma

Matt Broersma is a long-standing tech freelancer who has worked for Ziff-Davis, ZDNet and other leading publications.
