Sam Altman, CEO of OpenAI, during a panel session at the World Economic Forum in Davos, Switzerland, on January 18, 2024.

Stefan Wermuth | Bloomberg | Getty Images

Eight American newspaper publishers filed suit against Microsoft and OpenAI in federal court in New York on Tuesday, alleging that the tech companies are reusing their articles without permission in generative artificial intelligence products and improperly attributing inaccurate information to them.

The legal challenge comes four months after The New York Times sued OpenAI, alleging copyright infringement in the ChatGPT chatbot, which the startup launched in late 2022. OpenAI said in a January blog post that the case is without merit, adding that it wants to support a “healthy news ecosystem.” Sam Altman, CEO of OpenAI, said in January that the startup had wanted to pay The New York Times and was surprised to learn of the lawsuit.

In recent months, OpenAI has signed deals with a handful of media companies, including Axel Springer and the Financial Times, allowing the Microsoft-backed startup to use the publishers’ content to improve AI models. Google, which has its own general-purpose chatbot to answer user queries, said in February that it had reached an agreement with Reddit that includes the right to train AI models on the platform’s content.

The group of eight newspaper publishers is challenging ChatGPT and Microsoft’s Copilot assistant, which is available in the Windows operating system, the Bing search engine and other products from the software maker, for “appropriating millions of the publishers’ authored articles without permission and without payment,” according to the complaint.

Representatives for Microsoft and OpenAI did not immediately respond to requests for comment. The newspaper publishers in the suit run The New York Daily News, The Chicago Tribune, The Orlando Sentinel, The Sun-Sentinel of Florida, The Mercury News of California, The Denver Post, The Orange County Register of California and The Pioneer Press of Minnesota.

They said OpenAI used datasets containing text from their newspapers to train its large language models GPT-2 and GPT-3, which can spit out text in response to a few words of human input.

“The current GPT-4 LLM will output near-verbatim copies of substantial portions of the publishers’ works when prompted to do so,” the complaint said, showing several examples of ChatGPT and Copilot allegedly doing so.

The publishers said Microsoft copied information from their newspapers for the Bing search index, which helps inform responses on Copilot. But those responses do not always provide links to the newspapers’ websites, where readers can view advertisements alongside articles or pay for subscriptions.

The New York Times case also raised the issue of OpenAI models regurgitating information from its articles. In its blog post, OpenAI characterized such behavior as “a rare failure of the learning process that we are continually making progress on.”
