US Court Rules Data Scraping On Public Websites Is Legal


User privacy blow. Scraping data from online LinkedIn profiles is legal, Appeal Court in the United States tells Microsoft

A US appeal court has ruled on a controversial online practice that has significant privacy implications for many people and organisations.

On Monday, the US Ninth Circuit Court of Appeals ruled that web scraping of publicly accessible data is legal. This is a blow to Microsoft and its LinkedIn unit, which had challenged the practice, saying it endangers user privacy.

Data scraping is a sensitive subject for Microsoft, as in June last year LinkedIn was forced to deny it had suffered a data breach, after the personal data of 700 million LinkedIn users was found for sale on the dark web.


Data scraping

LinkedIn, it should be remembered, had been at the centre of a data breach allegation in April 2021, when an archive containing data scraped from 500 million LinkedIn profiles was put up for sale on a popular hacker forum, with another 2 million records leaked as a proof-of-concept sample by the post author.

The data found for sale on the dark web included full names; email addresses; phone numbers; physical addresses; geolocation records; LinkedIn username and profile URL; personal and professional experience/background; genders; and other social media accounts and usernames.

Last year, however, LinkedIn denied that the data for sale on the dark web was the result of a data breach, saying it was instead down to web scraping.

“We want to be clear that this is not a data breach and no private LinkedIn member data was exposed,” said the network in June 2021.

“Our initial investigation has found that this data was scraped from LinkedIn and other various websites and includes the same data reported earlier this year in our April 2021 scraping update.”

Microsoft challenge

LinkedIn has been trying to tackle web scraping for a while now.

In 2017, it sent a cease-and-desist letter to the CEO of HiQ Labs, which, according to Forbes, uses data scraped from public sections of LinkedIn to create reports for corporate customers, identifying which of their employees are most likely to quit and which are most likely to be targeted by recruiters.

Microsoft’s cease-and-desist letter warned HiQ that the business-focused social network had implemented “technical measures” to prevent the company from accessing the site.

After LinkedIn sent its cease-and-desist letter in 2017, HiQ reportedly asked the US District Court for the Northern District of California to issue an injunction preventing LinkedIn from interfering with its data-scraping practices, or “misusing the law to destroy HiQ’s business.”

According to Forbes, in 2019 the Appeals Court issued a preliminary injunction stopping LinkedIn from preventing HiQ Labs from accessing publicly visible LinkedIn member profiles.

Microsoft appealed the Appeals Court ruling to the Supreme Court, asking it to review the decision.

However, Forbes reported that the Supreme Court refused to hear the case, and instead ordered the Appeals Court to vacate its previous ruling and reconsider the case.

And this brings us to this week, when on Monday, the Appeals Court upheld its 2019 decision.

It cited the risk of HiQ going out of business if blocked from scraping LinkedIn data.

A LinkedIn spokesperson described the Appeals Court ruling as disappointing.

The spokesperson also indicated that the company intends to keep pursuing the matter, remarking that the case is “far from over.”

HiQ did not immediately respond to a request for comment, Forbes reported.

Logical decision

One expert said the Appeals Court decision was the only logical outcome of the challenge.

“In truth, the Court ruled that scraping of publicly available data (if data is stored in a way that it is in LinkedIn’s case) is legal in the light of the Computer Fraud and Abuse Act (CFAA). Nothing more, nothing less,” said Denas Grybauskas, head of legal at Oxylabs, a provider of public web data acquisition solutions and proxies.

“It’s definitely a great decision for the scraping industry, however, it just reaffirmed what probably the majority of those in the tech industry already knew: scraping of public data and hacking shouldn’t be treated as the same because these actions are completely different and should have entirely different legal implications,” said Grybauskas.

“The US Court of Appeals for the Ninth Circuit once again came to the only logical conclusion here that scraping of publicly available data does not breach the Computer Fraud and Abuse Act (CFAA),” said Grybauskas. “Many other scraping related questions in this battle of HiQ Labs and LinkedIn remain to be answered (such as the alleged breach of LinkedIn’s terms of service, privacy laws implications, etc).”

“We are happy for this reasonable decision as a different ruling could have brought terrible consequences to the whole industry,” said Grybauskas. “But this decision does not mean that those that scrape data should go berserk from now on.”

“One must always first evaluate the type of data they are planning to scrape,” said Grybauskas. “Consider what kind of other legal questions might need to be answered before starting to gather the data. And, like always, scraping ethics and well-being of the scraped targets should always be taken into account.”