Summary
Plaintiffs have brought a class action against Meta in the US District Court for the Northern District of California relating to its LLaMA (Large Language Model Meta AI) product. The claim notes Meta's statements that LLaMA was trained using books, including books from the Books3 section of The Pile dataset (assembled from content available on 'shadow library' websites, including Bibliotik), which the Plaintiffs contend includes their copyrighted works.
The claims (as originally drafted) included direct and vicarious copyright infringement, violations of the DMCA, violations of California unfair competition law, negligence and unjust enrichment.
Meta filed a Motion to Dismiss parts of the claim – the Motion to Dismiss only applies partially to the claim of direct infringement. On this, Meta's Motion states: "Use of texts to train LLaMA to statistically model language and generate original expression is transformative by nature and quintessential fair use—much like Google’s wholesale copying of books to create an internet search tool was found to be fair use in Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015)." Clearly, the issue of fair use is going to be central to this debate.
On Thursday 9 November 2023, US District Judge Vince Chhabria indicated that he would grant Meta's motion to dismiss the claims that content generated by Meta's LLaMA tool infringes the Plaintiffs' copyright (and that LLaMA is itself an infringing work), but would give the Plaintiffs permission to amend most of their claim.
On 11 December 2023, the Plaintiffs filed their amended Complaint, on the basis of direct copyright infringement.
The Plaintiffs filed a Second Amended Complaint on 9 September 2024.
Meta sought to prevent the deposition of its CEO, Mark Zuckerberg, but the Court denied its motion on 24 September 2024. The Plaintiffs had established that he was the chief decision maker and policy setter for Meta's generative AI brand and for the development of the large language models at issue in the action.
The claim has been consolidated with that brought by a number of authors including Michael Chabon, and also with the Huckabee action against Meta which has been transferred from the US District Court for the Southern District of New York to the US District Court for the Northern District of California.
In December 2024, the Plaintiffs filed a Motion for leave to file a Third Amended Consolidated Complaint, which was granted in January 2025. The Plaintiffs brought the Motion on the basis that Meta had produced "some of the most incriminating internal documents it has produced to date" shortly before the discovery deadline. The Third Amended Consolidated Complaint includes new claims under the California Comprehensive Computer Data Access and Fraud Act (CDAFA) and the DMCA, as well as copyright infringement claims relating to the seeding of the Plaintiffs' works during Meta's alleged torrenting of pirated files from the LibGen dataset.
Meta has filed a Motion to Dismiss the Third Amended Consolidated Complaint, arguing that the case should be focused on the fair use arguments as opposed to the Plaintiffs' attempts to 'distract' from that core issue with their new claims. The Plaintiffs have responded that the new claims are predicated on facts that strike at the heart of Meta's 'fair use' defence.
On 7 March 2025, Judge Chhabria granted the Motion to Dismiss in relation to the CDAFA (California Comprehensive Computer Data Access and Fraud Act) claim but denied the Motion as to the DMCA claim relating to removal of copyright management information, finding that the Plaintiffs had alleged a sufficient injury for Article III standing. On 10 March 2025, the Plaintiffs filed a Motion for Partial Summary Judgment on direct copyright infringement and on the ground that Meta's "initial acquisition of millions of pirated works cannot be fair use". Other aspects of the claim, including in relation to whether fair use applies to Meta's alleged infringements during and after the LLM training process, do not form part of the summary judgment motion.
Meta has responded to the Motion for Partial Summary Judgment and has itself sought summary judgment that its copying of the Plaintiffs' works to develop and train LLMs is fair use and on the DMCA claim. A number of amicus curiae briefs have been filed in support of both parties' arguments on the summary judgment motions.
The respective Summary Judgment Motions were heard on 1 May 2025. Prior to the hearing, the Judge shared a non-exclusive list of 12 questions for the parties to consider, relating in particular to the fair use arguments and to the downloading of pirated works for use in training AI models.
On 25 June 2025, Judge Chhabria issued his order on fair use in favour of Meta. Whilst he denied the Plaintiffs' motion for partial summary judgment and granted Meta's cross-motion, the discussion of fair use is more nuanced than that outcome would suggest. The Judge found that Meta's use of the works was highly transformative, which meant that the Plaintiffs needed to win decisively on the fourth fair use factor (the effect on the market for the works, and in particular market dilution) in order to defeat the fair use defence. On this point, the Judge noted that "in cases involving use like Meta's, it seems the plaintiffs will often win, at least where those cases have better-developed records on the market effects of the defendant's use. No matter how transformative LLM training may be, it's hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books". He also suggested that some markets (eg news articles) might be even more vulnerable to indirect competition from AI outputs. However, in other situations there may be fair use, including where the plaintiffs' works are unlikely to face meaningful competition from AI-generated ones.
The Judge concluded that, because the issue of market dilution was so important, this aspect of the case would have had to go forward to a jury if the Plaintiffs had presented evidence on which a jury could decide the issue in their favour. However, because they had not done so, he decided the fair use arguments in favour of Meta.
It is also worth noting that the Judge disagreed with the opinion expressed by Judge Alsup in the Bartz v Anthropic summary judgment decision (see further below) and with its heavy focus on the transformative nature of generative AI. He described Judge Alsup as "brushing aside concerns about the harm it can inflict on the market for the works it gets trained on", and also criticised Judge Alsup's analogy between using the works and "training schoolchildren to write well".