Reddit sues Perplexity for scraping of posts, expanding user data battle with AI industry



Social media giant Reddit has launched a suit against artificial intelligence company Perplexity, alleging that it illegally scraped user posts to train its AI model, marking the latest data-rights clash between content owners and the AI industry.

The complaint, filed in New York federal court on Wednesday, also named three other defendants, which Reddit says helped Perplexity collect its data: Lithuanian data scraper Oxylabs, "former Russian botnet" AWMProxy, and Texas startup SerpApi.

Reddit alleged that the three smaller entities were able to extract its copyrighted content "by masking their identities, hiding their locations and disguising their web scrapers as regular people."

Perplexity, which runs an AI-powered search engine, denied the allegations and accused Reddit of "extortion" and opposition to an open internet, while SerpApi told CNBC it "strongly disagrees" with Reddit's claims and intends to defend itself in court.

The lawsuit is one of many filed by content owners accusing AI firms of using copyrighted material without permission to train their large language models. Reddit, in particular, has been on the front lines of that battle, having launched a similar ongoing lawsuit against AI startup Anthropic in June. CNBC was unable to reach Oxylabs and AWMProxy.

In a statement shared with CNBC, Ben Lee, Chief Legal Officer at Reddit, said that AI companies are "locked in an arms race for quality human content" and that pressure has fueled an "industrial-scale 'data laundering' economy."

Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it's one of the largest and most dynamic collections of human conversation ever created.

Reddit, which hosts over 100,000 interest-based "subreddit" communities, said in its suit that its user posts had become the most commonly cited source for AI-generated answers on Perplexity.

It added that it sent Perplexity a cease-and-desist letter, after which Perplexity "increased the volume of citations to Reddit forty-fold."

AI researchers have previously noted that Reddit's large volume of moderated conversations can help AI chatbots produce more natural-sounding responses.

In the age of artificial intelligence, Reddit has worked to leverage its massive data pool, permitting access to it only through AI-related licensing agreements. The social media company has signed such agreements with OpenAI and Alphabet's Google.

In response to the lawsuit, Perplexity argued in a post on the Reddit platform that it does not train AI models on content but simply summarizes and cites public Reddit discussions. Therefore, it said, it is "impossible" to sign a license agreement.

"A year ago, after explaining this, Reddit insisted we pay anyway, despite lawfully accessing Reddit data. Bowing to strong arm tactics just isn't how we do business," the statement read, going on to describe the suit as a "show of force in Reddit's training data negotiations with Google and OpenAI."

"Perplexity believes this is a sad example of what happens when public data becomes a big part of a public company's business model," Perplexity added, noting that data licensing has become an increasingly important source of revenue for Reddit.

In February, Reddit's COO Jen Wong told the trade publication Adweek that AI licensing deals with Google and OpenAI made up about 10% of Reddit's revenue.
