Categories: Security

DefCon 2017: ‘Anonymous’ Browsing Data Easy To De-Anonymise

A pair of German researchers have shown just how easy it is to identify individuals and track their internet browsing habits in detail using supposedly ‘anonymised’ data sources.

The research, presented at DefCon in Las Vegas by journalist Svea Eckert and data scientist Andreas Dewes, sheds light on the practices of companies that collect user data, either for their own purposes or selling it on to third parties.

Defeating anonymisation

The data is, in theory, stripped of identifying information before being used, but Eckert and Dewes found that in some cases as few as 10 web addresses is enough to identify who the ‘clickstream’ belongs to.

Once the individual has been identified, often by matching particular information in the clickstream to data that’s publicly available, the stream indicates everything that user has been doing online, minute by minute, Eckert and Dewes said.

For instance, their experiment uncovered the porn habits of a judge and the drug preferences of a German MP.

They found details of an ongoing police investigation by examining Google Translate URLs, in which which are stored the full text of any query, after matching one clickstream to a particular police detective.

In many cases identity could be established by examining particular URLs – for instance, if someone logs into their own analytics page on Twitter an address is generated that contains their own username.

In other cases the clickstream might indicate the user visited a particular site, at a particular time, say, a YouTube video, when the individual has mentioned looking at that video on a publicly visible source such as a blog.

Nosy browser extensions

“The increase in publicly available information on many people makes de-anonymisation via linkage attacks easier than ever before,” the researchers said the presentation.

The data was surprisingly easy to obtain, with 95 percent of it coming from 10 popular browser extensions. Such extensions offer users a service, but also monitor everything they do online and use it themselves, for purposes such as targeting advertisements, or sell it to third parties.

They estimated up to 10,000 extensions collect detailed user data, but most have a relatively small user base.


“When thinking about surveillance, everyone worries about government agencies like the NSA and big corporations like Google and Facebook,” Eckert wrote in a blog post. “But actually there are hundreds of companies that have also discovered data collection as a revenue source… Most of them keep their data to themselves, some exchange it, but a few sell it to anyone who’s willing to pay,”

Social engineering

The researchers posed as a marketing company that wanted to buy browsing information to train its machine learning tools, and it took them about two weeks to obtain one month’s worth of browsing information on three million German users, compiling a database of 3 billion URLs spread across 9 million sites.

The information was so sensitive they deleted it after the investigation for fear it might fall into the hands of hackers. In her blog post Eckert said the way browser extension companies collect and resell user data is “often illegal” under European law.

A data broker provided Eckert and Dewes for free with information obtained from browser plugins including Web Of Trust (WOT), which, ironically, provides reviews of websites’ privacy practices.

After German public broadcasting network NDR published a report based on the study last November, WOT reworked its extensions and mobile app to better protect users’ anonymity, also giving users the ability to opt out of data collection.

But Eckert and Dewes said it’s next to impossible to make a clickstream fully anonymous.

“High-dimensional, user-related data is really hard to robustly anonymise, even if you really try to do so,” they said in the DefCon presentation.

Users who want to anonymise the clickstream themselves can use services such as TOR or a VPN with rotating exit nodes, or client-side software that blocks trackers, they said.

How much do you know about privacy? Try our quiz!

Matthew Broersma

Matt Broersma is a long standing tech freelance, who has worked for Ziff-Davis, ZDnet and other leading publications

Recent Posts

Ericsson To Cut 1,200 Jobs in Sweden Amid ‘Challenging’ Market

Swedish telecoms giant Ericsson blamed “challenging mobile networks market” and “further volume contraction” for job…

3 hours ago

FTX’s Sam Bankman-Fried Sentenced To 25 Years In Prison For $8bn Fraud

Dramatic downfall. Sam Bankman-Fried sentenced to 25 years in prison for masterminding $8bn fraud that…

4 hours ago

Elon Musk Orders FSD Demo For Every Tesla US Sale

Fallout avoidance? Tesla buyers in the US must be shown how to use the FSD…

5 hours ago

Amazon Pumps Another $2.75 Billion Into Anthropic

Amazon completes its $4bn investment into AI firm Anthropic, after providing an additional $2.75bn in…

7 hours ago

The Sustainability of AI

While AI promises unparalleled efficiency, productivity, and innovation, questions regarding its environmental impact loom large.…

10 hours ago

Trump’s Truth Social Makes Successful Market Debut

Shares in Donald Trump’s social media company rose about 16 percent after first day of…

10 hours ago