Is it illegal to scrape a website for content?

Web scraping is the process of using bots to extract information from a website. In recent years, the debate over web scraping has become increasingly complex as business intelligence and data privacy issues have come to the fore.

Web scraping has existed for almost as long as there have been websites. To be fair, there is “good” web scraping that is, in fact, a foundation of the internet. Here are some examples of “good” web scraping:

  • Search engine crawlers crawl websites to index, analyze and rank their content
  • Price comparison sites deploy bots to automatically collect product prices and descriptions for allied seller websites, allowing consumers to compare prices of goods and services and make more informed purchasing choices
  • Market research companies use web scrapers to mine data from forums and social media to gauge public opinion (e.g., to report on “what’s trending”)

This, however, is where the good part of the web scraping story ends. Bad bots retrieve content from a website with the intention of using it for purposes beyond the control of the site owner. According to the Imperva Bad Bots Report 2022, they accounted for 27.7% of all web, mobile and API traffic, an increase of 2.1% over the previous year. Apart from web scraping, cyber criminals use bad bots to conduct various harmful activities, including denial of service attacks, competitive data mining, online fraud, account takeover, data theft, intellectual property theft, unauthorized vulnerability scans, spam and digital ad fraud.

The two main ways malicious actors abuse web scraping are undercutting prices to gain an unfair competitive advantage and stealing copyrighted content and intellectual property. The question remains: is it illegal?

The case of LinkedIn and hiQ Labs

In the summer of 2017, LinkedIn and hiQ Labs, a San Francisco-based startup, ended up in court after LinkedIn moved to block hiQ’s scraping. hiQ scraped publicly available LinkedIn profiles to offer clients, according to its website, “a crystal ball that helps you identify skill gaps or turnover risks months in advance.”

The idea that your public LinkedIn profile could be used against you by your employer is quite troubling. However, on August 14, 2017, a judge ruled in hiQ’s favor. Judge Edward Chen of the U.S. District Court in San Francisco accepted hiQ’s claim, made in a lawsuit, that Microsoft-owned LinkedIn violated antitrust laws by blocking the startup from accessing that data. He ordered LinkedIn to remove the barriers within 24 hours. LinkedIn appealed.

The decision runs counter to previous court rulings that favored cracking down on web scraping, and it raises myriad questions about the privacy of social media users and the right of businesses to protect themselves against data breaches. There is also the issue of fairness. LinkedIn has spent years creating something of real value. Why should it have to hand that over to hiQ, paying for servers and bandwidth to host all that bot traffic on top of its own human users, just so hiQ can scrape LinkedIn?

The final word has yet to be spoken in the legal battle between LinkedIn and hiQ Labs, which describes itself as a “data science company, informed by public data sources, applied to human capital.” LinkedIn is attempting to prevent hiQ from scraping personal information from users’ public profiles. After the Ninth Circuit Court of Appeals ruled in favor of allowing bots to scrape publicly available content, LinkedIn filed a petition seeking Supreme Court review in March 2020. In June 2021, the Supreme Court gave LinkedIn another chance to stop hiQ. It did not take up the case itself. Instead, it ordered the court of appeals to rehear the case in light of a recent Supreme Court ruling, which found that a person does not violate the Computer Fraud and Abuse Act (CFAA) by improperly accessing data on a computer they are allowed to use.

That is not the only battle LinkedIn is currently fighting. In February 2022, LinkedIn filed a lawsuit against Singapore-based data scraper group Mantheos Pte. Ltd., Jeremiah Tang, Yuxi Chew and Stan Kosyakov. The complaint claims that they illegally profit from scraping data from the LinkedIn website, in violation of its terms of use and to the detriment of its users. The case continues.

What’s the verdict on web scraping?

As discussed here, the legality of web scraping remains unsettled, and website owners continue to pursue legal action to prevent their sites from being scraped. While the courts work out where the law stands, your data could be stolen and your website’s business logic abused. Instead of relying on legal remedies to overcome this technological challenge, consider solving it with advanced bot protection and anti-scraping technology today.

Imperva advanced bot protection protects your websites, mobile apps, and APIs from automated attacks without impacting mission-critical traffic flow.

*** This is a syndicated blog from the Security Bloggers Network of Blog written by Bruce Lynch. Read the original post at: https://www.imperva.com/blog/is-it-illegal-to-scrape-a-website-for-content/
