Last week, a New York federal judge ruled that a copyright claim brought by The Intercept against OpenAI would move forward in court. The ruling is the latest in a series of major legal decisions involving the AI developer this month, after OpenAI sought to dismiss lawsuits filed by several digital news publishers.
Judge Jed Rakoff said he would hear the claim that OpenAI removed authorship information when it allegedly fed The Intercept's articles into the training datasets used to create ChatGPT. That could constitute a violation of the Digital Millennium Copyright Act (DMCA), a 1998 law that, among other protections, prohibits removing the author's name, terms of use, or title from a digital work.
The judge rejected The Intercept's claim that OpenAI had knowingly distributed copies of its articles after removing the DMCA-protected information. He also dismissed all of The Intercept's claims against Microsoft, which has a multibillion-dollar investment in OpenAI and was named in the initial filing. An opinion from the judge setting out the reasons for these dismissals is expected in the coming weeks.
"The ruling allows a DMCA claim to move forward on behalf of digital publishers who do not have copyright registrations to sue OpenAI," said Matt Topic, a partner at Loevy & Loevy, which represents The Intercept. "We're obviously disappointed to lose the claims against Microsoft, but the main claim is the DMCA claim against OpenAI, and we're very happy to see that moving forward."
“Our models are trained on publicly available data, based on fair use and related principles that we consider fair to creators,” OpenAI spokesperson Jason Deutrom said in a statement.
Earlier this year, I reported that The Intercept's case charted a new legal strategy for digital news publishers suing OpenAI.
The New York Times' lawsuit against OpenAI, and similar suits filed by the New York Daily News and Mother Jones, lead with copyright infringement claims. Infringement claims require that the affected works first be registered with the U.S. Copyright Office (USCO). But most digital news publishers have not registered their article archives. For many, including The Intercept, registering all of their web-published work with the USCO is too costly or burdensome.
Until this summer, the agency required each webpage of articles to be registered, and paid for, separately. But in August, the USCO adopted a rule that allows "news websites" to register articles in bulk. Among other reasons, the rule cited concerns about unchecked infringement of online news content and the hope that copyright registration would remain "adaptive to technological changes." But for most digital news publishers seeking to take legal action against OpenAI over the use of their work to train ChatGPT, the new rule came too late.
So far, The Intercept's case is the only news publisher lawsuit not built on copyright infringement claims to move past the motion to dismiss stage.
Earlier this month, the DMCA-focused legal strategy took a hard hit when another federal judge in New York dismissed all of the DMCA claims against OpenAI filed by Raw Story and AlterNet. The progressive digital news sites are jointly represented by Loevy & Loevy.
"Let us be clear about what is really at stake here. The alleged injury for which Plaintiffs truly seek redress is not the exclusion of [copyright management information] from Defendants' training sets, but rather Defendants' use of Plaintiffs' articles to develop ChatGPT without compensation," Judge Colleen McMahon wrote in her decision.
Despite the setback, the judge said she would consider an amended complaint against OpenAI that addressed her concerns. Loevy & Loevy filed a proposed amended complaint on behalf of Raw Story and AlterNet last week, just before the decision in The Intercept's case was announced.
"When populating their training sets with works of journalism, Defendants had a choice: They could train ChatGPT using works of journalism with the copyright management information protected by the DMCA intact, or they could strip it out. Defendants chose the latter," the proposed amended complaint reads. "In the process, [OpenAI] trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists' copyrights, and not to provide attribution when using the works of human journalists."
Like The Intercept, Raw Story and AlterNet are seeking $2,500 in damages for each instance in which OpenAI removed DMCA-protected information from its training datasets. If damages are calculated per individual article allegedly used to train ChatGPT, the violations could quickly number in the tens of thousands.
"The proposed amended complaint would match and likely even go beyond the allegations that have survived in The Intercept case," Topic said. "Different judges might rule differently on the same issue, but we are optimistic that we will have the opportunity to move forward on an amended complaint."
It's unclear whether the decision in The Intercept's case will encourage other publications to consider DMCA litigation; so far, few have followed its lead. As time passes, there are concerns that further lawsuits against OpenAI could run up against statute of limitations restrictions, particularly if news publishers want to point to the training datasets underlying ChatGPT. But the ruling is a sign that Loevy & Loevy has zeroed in on a specific DMCA claim that can actually stand up in court.
“We think the claim that has survived for The Intercept is one that most digital publishers would also be able to make,” Topic said.