AI company Anthropic not only illegally downloaded copyrighted music lyrics, it also uploaded them to other users, music publishers allege in a new court filing.
Publishers including Universal Music Group, Concord and ABKCO sued Anthropic in 2023 for copyright infringement, alleging that its Claude chatbot regurgitated copyrighted lyrics, indicating the company had trained the chatbot on those lyrics without permission.
Now the music publishers additionally allege that Anthropic hid the fact that it used BitTorrent to pirate copyrighted lyrics, according to a document their lawyers filed on Monday (August 11) with the US District Court for the Northern District of California.
Instead, the publishers found out about it through a separate copyright lawsuit against Anthropic. A number of book authors sued Anthropic in 2023, alleging that Claude had been trained on their books without permission, and evidence of Anthropic using BitTorrent was presented in that case.
In June, Judge William Alsup of the same district court ruled in that case that Anthropic’s unauthorized use of books to train its AI is “fair use” under US copyright law – but pirating books through illicit websites is not. The judge ordered Anthropic to stand trial for piracy in December.
(This past Monday, the judge rejected Anthropic’s motion to stay the case while it appeals the ruling.)
“Inexplicably, Anthropic never disclosed to publishers in this case that it had used BitTorrent to copy books containing their works from pirate sites in this manner, despite publishers’ discovery requests calling for exactly this type of information,” lawyers for the music publishers wrote.
The lawyers asked Judge Eumi K. Lee for leave to amend their complaint against Anthropic to include the new allegations about BitTorrent, and to reschedule future court hearings so that they have time to investigate the matter.
The court filing suggests that the music publishers could add a new charge against Anthropic: distributing copyrighted lyrics without a license.
BitTorrent is a decentralized file-sharing system in which anyone who downloads a file also uploads parts of that file to other users. That means Anthropic would also have uploaded the lyrics to other users engaged in piracy and, in so doing, violated publishers’ exclusive distribution rights for those lyrics.
In the case brought by book authors against Anthropic, Judge Alsup found that Anthropic torrented 5 million files from the pirate online library LibGen, 2 million files from Pirate Library Mirror (PiLiMi), and nearly 200,000 records in the Books3 collection.
Lawyers for the music publishers pointed out that the LibGen catalog of pirated books includes numerous books of song lyrics and sheet music, including works at issue in the copyright infringement case.
LibGen “contains well over a thousand illegal copies of sheet music, songbooks, and other lyric-related books,” the music publishers’ lawyers wrote in the court filing.
“These include numerous standalone copies of sheet music and lyrics to publishers’ works [involved in the lawsuit] specifically, such as Tiny Dancer (written by Elton John and Bernie Taupin), A Thousand Miles (written by Vanessa Carlton), and 7 Rings (recorded by Ariana Grande).”
Anthropic is not the only AI developer that stands accused of using mass piracy techniques to gather the training data for its AI models.
In a congressional hearing last month, led by Missouri Republican Sen. Josh Hawley, a copyright law expert alleged that Meta, the parent company of Facebook and Instagram and developer of the AI tool Llama, used BitTorrent to collect data with the knowledge of CEO Mark Zuckerberg.
Intellectual property lawyer Maxwell Pritt of Boies Schiller Flexner LLP told the hearing that the US’s leading AI companies engaged in “what is likely the largest domestic piracy of intellectual property in our nation’s history. That piracy includes hundreds of terabytes of data and many millions of works, including, for example, at least 12 books authored by members of this subcommittee.”
Music Business Worldwide