2024-10-21

Media baron Rupert Murdoch's Dow Jones and New York Post filed a lawsuit against Perplexity AI on Monday, claiming the AI startup engages in a "massive amount of illegal copying" of their copyrighted work.

The lawsuit is the latest salvo in a bitter ongoing battle between publishers and tech companies over how the latter may use copyrighted content without authorization to build and operate their AI systems.

Perplexity's search tools enable users to get instant answers to questions with sources and citations. They are powered by a variety of large language models (LLMs) that can summarize and generate information, ranging from OpenAI's models to Meta's open-source model Llama.

"This suit is brought by news publishers who seek redress for Perplexity's brazen scheme to compete for readers while simultaneously freeriding on the valuable content the publishers produce," read the lawsuit filed in the Southern District of New York.

Wall Street Journal parent Dow Jones and the New York Post are owned by Murdoch's News Corp.

Perplexity did not immediately respond to an email from Reuters seeking comment.

In the suit, the news publishers say their journalists investigate and write stories under tight deadlines and unpredictable circumstances. There is high demand for high-quality news presented in a timely, digestible format, they argue. These publications rely on the sale of advertising and subscriptions to underwrite the cost of journalism.

The news organizations allege Perplexity’s AI-generated “answer machine” has ingested their copyrighted news stories, analysis and opinion into an internal database used to generate responses to users’ queries. They say its responses act as a substitute for other news and information sources, with Perplexity touting its answers as so reliable that users can “skip the links.”

In the quest to provide answers, Dow Jones and the New York Post allege, Perplexity copied "vast" quantities of their work into a database, which uses an AI technique known as retrieval-augmented generation (RAG) to provide answers to users' queries.

Perplexity formulates its responses in a way that at times reproduces the content verbatim, the news organizations claim. The suit alleges these actions constitute unlawful copyright infringement.

"Perplexity perpetrates an abuse of intellectual property that harms journalists, writers, publishers and News Corp," News Corp CEO Robert Thomson said in a statement.

"The perplexing Perplexity has willfully copied copious amounts of copyrighted material without compensation, and shamelessly presents repurposed material as a direct substitute for the original source. Perplexity proudly states that users can 'skip the links' - apparently, Perplexity wants to skip the check," he said.

Dow Jones and the New York Post are asking the court to stop Perplexity from using their news articles as the basis for providing answers to questions, and to order the destruction of any database using their copyrighted work.

With its lawsuit, News Corp joins multiple publishers that have sued AI companies for copyright infringement over their use of content without authorization, both to train algorithms and to generate summaries of real-time information.

Earlier this month, the New York Times sent Perplexity a "cease and desist" notice demanding that it stop using the newspaper's content for generative AI purposes.

Perplexity has also faced accusations from media organizations such as Forbes and Wired of plagiarizing their content, but has since launched a revenue-sharing program to address some concerns put forward by publishers.

Some publishers are signing licensing agreements with AI companies open to paying for content, although the sides often disagree over the value of the materials. Many AI developers argue they have broken no laws in accessing the materials for free.

In May, News Corp announced it had struck a multi-year partnership with OpenAI, with Thomson applauding the tech company for understanding "that integrity and creativity are essential" to realize the potential of artificial intelligence.

While Perplexity has drawn the most scrutiny for its RAG practices, it is not alone among AI companies in circumventing a common web standard used by publishers to block the scraping of their content, content licensing startup TollBit told publishers over the summer.