Mishcon de Reya

Generative AI – Intellectual property cases and policy tracker

Case tracker

With businesses in various sectors exploring the opportunities arising from the explosion in generative AI tools, it is important to be alive to the potential risks. In particular, the use of generative AI tools raises several issues relating to intellectual property, with potential concerns around infringements of IP rights in the inputs used to train such tools, as well as in output materials. There are also unresolved questions of the extent to which works generated by AI should be protected by IP rights. These issues are before the courts in various jurisdictions, and are also the subject of ongoing policy and regulatory discussions.

In this tracker, we provide insight into the various intellectual property cases relating to generative AI going through the courts (focusing on a series of copyright cases in the US and UK), as well as anticipated policy and legislative developments.

Read more in our Guides to Generative AI & IP and to the use of Generative AI generally.

Please sign up to receive regular updates.


This page was last updated on 7 June 2024.

Court Cases

9 May 2024

Makkai v Databricks, Inc

Rebecca Makkai and Jason Reynolds v Databricks, Inc., and Mosaic ML, Inc.

US

Case: 4:24-cv-02653

Complaint: 2 May 2024

Answer to Complaint by Databricks, Inc, MosaicML, Inc: 29 May 2024

Summary

This class action complaint has been issued in the US District Court Northern District of California by two authors (Rebecca Makkai and Jason Reynolds) against MosaicML and its parent company Databricks. Makkai owns registered copyrights in a number of books including The Hundred-Year House, while Reynolds owns registered copyrights in books including As Brave as You.

The plaintiffs allege that their copyright works were included in the training dataset for MosaicML Pretrained Transformer (MPT), a series of large language models created by MosaicML and distributed by Databricks (including MPT-7B, launched in May 2023, and MPT-30B, launched in June 2023). MosaicML has noted that a large quantity of data in the MPT training datasets comes from a component dataset called "RedPajama – Books". The complaint asserts that this is hosted on the Hugging Face website and that its Books component is a copy of the Books3 dataset, which is itself a component of The Pile, derived from the Bibliotik shadow library comprising approximately 196,640 books. The complaint against MosaicML is for direct copyright infringement. The complaint against Databricks is for vicarious infringement (Databricks having acquired MosaicML in July 2023).

Impact

This is a further case brought by authors in relation to the use of their copyright works in training datasets for AI models (see also O'Nan v Databricks below).

9 May 2024

Dubus v Nvidia

Andre Dubus III and Susan Orlean v Nvidia Corporation

US

Case: 4:24-cv-02655

Complaint: 2 May 2024

Order relating case to Nazemian v Nvidia: 29 May 2024 

Summary

This class action complaint has been issued in the US District Court Northern District of California by two authors owning registered copyrights in certain books alleged to have been included in the training dataset Nvidia used to train its NeMo Megatron models, released in September 2022. The complaint alleges that each of the NeMo Megatron models is hosted on a website called Hugging Face and each has a model card that provides information about the model, including its training dataset – for each of the NeMo Megatron models, the model card states that "the model was trained on 'The Pile' dataset prepared by Eleuther AI" (which includes the Books3 dataset, derived from the Bibliotik shadow library). The complaint is for direct copyright infringement.

Impact

This is a further case brought by authors in relation to the use of their copyright works in training datasets for AI models (see also below Nazemian v Nvidia).

1 May 2024

Daily News v Microsoft and OpenAI

Daily News, L.P., Chicago Tribune Company, LLC, Orlando Sentinel Communications Company, LLC, Sun-Sentinel Company, LLC, San Jose Mercury-News, LLC, DP Media Network, LLC, ORB Publishing, LLC, and Northwest Publications, LLC v  Microsoft Corporation, OpenAI, Inc., OpenAI LP, OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco, LLC, OpenAI Global, LLC, OAI Corporation, LLC and OpenAI Holdings LLC

US

Case: 1:24-cv-03285

Complaint: 30 April 2024

Summary

This complaint has been issued in the US District Court Southern District of New York by a number of regional and local newspapers (such as the New York Daily News and Chicago Tribune) and their publishers against OpenAI and Microsoft.

As with the complaint brought by The New York Times, examples are given of the GPT LLMs having 'memorised' copies of training data, as well as alleged hallucinations. The complaint is for direct copyright infringement, vicarious copyright infringement, contributory copyright infringement (including in relation to end users, to the extent end users are liable as direct infringers), removal of copyright management information, common law unfair competition by misappropriation, trade mark dilution (in branding outputs generated by OpenAI's GPT-based products), and dilution and injury to business reputation.

Impact

Describing themselves as a 'rare breed in America' in terms of providing local news coverage, the Plaintiffs cite the new threat posed to them by GenAI products. But they also stress that this is not a battle between new and old technology but one based on the alleged use of copyrighted newspaper content, without their consent and without what they see as fair payment. The case should be tracked alongside the complaint brought by The New York Times.

30 April 2024

Zhang v Google LLC

Jingna Zhang, Sarah Andersen, Hope Larson and Jessica Fink v Google LLC and Alphabet Inc.

US

Case: 5:24-cv-02531

Complaint against Alphabet and Google: 26 April 2024

Summary

This class action complaint has been brought by a number of visual artists against Google (and its parent company Alphabet) in relation to its text-to-image diffusion models Imagen (announced in May 2022 but not immediately released to the public) and Imagen 2 (released in December 2023), as well as multi-modal models trained on both images and text (such as Google Gemini). The complaint is (only) for direct copyright infringement against Google and vicarious copyright infringement against Alphabet. The complaint is based on an argument that the key source of Google's training data is the LAION image datasets.

Impact

This is the latest claim brought by visual artists, and includes as one of the Plaintiffs Sarah Andersen, who is also a Plaintiff in the action against StabilityAI and other providers of text-to-image models.

20 March 2024

Nazemian v Nvidia

Abdi Nazemian, Brian Keene and Stewart O'Nan v Nvidia Corporation

US

Case: 5:24-cv-01454

Complaint: 8 March 2024

Answer to Complaint by Nvidia: 24 May 2024

Order relating case to Dubus v Nvidia: 29 May 2024

Summary

In this class action complaint filed by three authors against Nvidia in the US District Court Northern District of California San Francisco Division, the Plaintiffs have brought a claim of direct copyright infringement against Nvidia relating to its NeMo Megatron LLM series released in September 2022.

The complaint alleges that the Plaintiffs' registered copyrights were included in the training dataset used by Nvidia to develop its models. Each of the models is hosted on a website called Hugging Face, with a model card that provides information about the model, including its training dataset, in which it is stated that the model was trained on 'The Pile' dataset prepared by EleutherAI (the complaint therefore alleges that the LLM series was trained on one or more of the Plaintiffs' works).

Impact

This is another case relating to 'The Pile', one component of which is alleged to be a collection of books called Books3, derived from the Bibliotik 'shadow library' website. According to the complaint, the Books3 dataset was removed from Hugging Face in October 2023.

19 March 2024

O'Nan v Databricks

Stewart O'Nan, Abdi Nazemian and Brian Keene v Databricks, Inc., and MosaicML, Inc.

US

Case 3:24-cv-01451

Complaint: 8 March 2024

Answer to Complaint: 2 May 2024

Summary

In this class action filed by three authors against MosaicML (and its parent company Databricks) in the US District Court Northern District of California San Francisco Division, the Plaintiffs have brought a claim of direct copyright infringement relating to the training of MosaicML's Pretrained Transformer (MPT) models, including MPT-7B and MPT-30B. The complaint alleges that the MPTs were trained on a large quantity of data taken from a component dataset called 'RedPajama – Books', hosted on Hugging Face, in respect of which the 'Books' component is a copy of the "Books3 dataset", which is itself a component of The Pile dataset. The complaint also alleges vicarious infringement against Databricks.

Impact

This is another case relating to 'The Pile', one component of which is alleged to be a collection of books called Books3, derived from the Bibliotik 'shadow library' website.

19 February 2024

In re ChatGPT Litigation: Tremblay v OpenAI (consolidated with Silverman v OpenAI and Chabon v OpenAI)

(1) Paul Tremblay & (2) Mona Awad v (1) OpenAI, Inc.; (2) OpenAI, L.P.; (3) OpenAI Gp, L.L.C., (4) OpenAI Opco, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C.; (6) OpenAI Startup Fund I, L.P.;(7) OpenAI Startup Fund Management, LLC 

US

Case 3:23-cv-03223

Complaint: 28 June 2023

Motion to dismiss by OpenAI: 28 August 2023

Opposition/Response to Motion to Dismiss: 27 September 2023

Reply re Motion to Dismiss: 11 October 2023

Order consolidating related cases: 9 November 2023

Order by Judge Araceli Martinez-Olguin granting in part and denying in part Motion to Dismiss: 12 February 2024

Motion to Intervene, enjoin Defendants and their Counsel from proceeding in substantially similar cases in the Southern District of New York: 8 February 2024

Defendants' Opposition/Response re Motion to Intervene, Enjoin Defendants and their Counsel: 22 February 2024  

Plaintiffs' Reply re Motion to Intervene, Enjoin Defendants and their Counsel: 29 February 2024

First Consolidated Amended Complaint against All Defendants: 13 March 2024

Motion to Dismiss First Consolidated Amended Complaint filed by OpenAI: 27 March 2024

Opposition/Response re Motion to Dismiss First Amended Complaint filed by Plaintiffs: 10 April 2024

Reply re Motion to Dismiss First Consolidated Amended Complaint filed by OpenAI: 17 April 2024

Summary

Three cases against OpenAI have now been consolidated (Tremblay v OpenAI, Silverman v OpenAI and Chabon v OpenAI). In the most recent development, a New York Court has rejected an application by the various Plaintiffs in California to intervene in proceedings in New York for the purpose of moving to dismiss, stay or transfer the New York actions to California.

This class action claim has been brought by two authors as individual and representative Plaintiffs against OpenAI relating to its ChatGPT large language model (LLM). The claim has been brought in the US District Court for the Northern District of California (Mona Awad voluntarily applied for the dismissal of her claim on 11 August 2023).

The Plaintiffs allege that, during the training process of its LLMs, OpenAI copied "at least Plaintiff Tremblay’s book The Cabin at the End of the World; and Plaintiff Awad’s books 13 Ways of Looking at a Fat Girl and Bunny" without their permission. Further, they argue that "because the OpenAI Language Models cannot function without the expressive information extracted from Plaintiffs’ works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works, made without Plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act". The complaint also notes that, when prompted, ChatGPT generates summaries of the Plaintiffs' works.

Of particular relevance in this case are the datasets which OpenAI used in training its GPT models (OpenAI having confirmed it used datasets called Books1 and Books2, though it has not revealed the contents of those datasets).

In addition to direct and vicarious copyright infringement, the class action alleges violations of the Digital Millennium Copyright Act, unjust enrichment, violations of the California and common law unfair competition laws, and negligence.

On 28 August 2023, OpenAI filed a Motion to Dismiss a number (but not all) of the Plaintiffs' claims. In particular, the Motion to Dismiss does not relate to the direct copyright claim, in respect of which OpenAI relies on a defence of fair use.

OpenAI's motion to dismiss was heard on 7 December 2023. In an order of 12 February 2024, the Court dismissed a number of claims in the Complaint, but with leave to amend in relation to the claim to vicarious infringement and the copyright management information (CMI) claim (the claim to direct infringement was not included in the motion to dismiss).  

Impact

The case has now been consolidated with the Silverman and Chabon actions against OpenAI.  OpenAI has applied to dismiss a number of the claims. 

As OpenAI puts it in its Reply document, "the issue at the heart of this litigation is whether training artificial intelligence to understand human knowledge violates copyright law. It is on that question that the parties fundamentally disagree, and on which the future of artificial intelligence may turn".

18 February 2024

Silverman & ors v OpenAI (consolidated with Tremblay v OpenAI and Chabon v OpenAI)

(1) Sarah Silverman, (2) Christopher Golden & (3) Richard Kadrey v (1) OpenAI, Inc.; (2) OpenAI, L.P.; (3) OpenAI Gp, L.L.C., (4) OpenAI Opco, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C.; (6) OpenAI Startup Fund I, L.P.;(7) OpenAI Startup Fund Management, LLC 

US

Case 3:23-cv-03416

Complaint: 7 July 2023

Motion to Dismiss by OpenAI: 28 August 2023

Plaintiffs' Opposition to OpenAI's Motion to dismiss: 27 September 2023

OpenAI's Reply re Motion to dismiss: 11 October 2023

Order consolidating related cases: 9 November 2023

Order by Judge Araceli Martinez-Olguin granting in part and denying in part Motion to Dismiss: 12 February 2024

Summary 

This case has now been consolidated with Tremblay v OpenAI – see Tremblay entry for future updates.

In proceedings related to the complaint filed by Tremblay and Awad, the comedian Sarah Silverman and other Plaintiffs, as individual and representative plaintiffs, have also brought proceedings against OpenAI relating to ChatGPT in the US District Court for the Northern District of California.

The Plaintiffs allege that, during the training process of its LLMs, OpenAI copied "at least Plaintiff Silverman’s book The Bedwetter; Plaintiff Golden’s book Ararat; and Plaintiff Kadrey’s book Sandman Slim" without the Plaintiffs' permission. Further, it is argued that "because the OpenAI Language Models cannot function without the expressive information extracted from Plaintiffs’ works (and others) and retained inside them, the OpenAI Language Models are themselves infringing derivative works, made without Plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act".

In addition to direct and vicarious copyright infringement, the class action alleges violations of the DMCA, unjust enrichment, violations of the California and common law unfair competition laws, and negligence.

On 28 August 2023, OpenAI filed a Motion to Dismiss a number (but not all) of the Plaintiffs' claims. In particular, the Motion to Dismiss does not relate to the direct copyright claim, in respect of which OpenAI relies on a defence of fair use.

OpenAI's motion to dismiss was heard on 7 December 2023. In an order of 12 February 2024, the Court dismissed a number of claims in the Complaint, but with leave to amend in relation to the claim to vicarious infringement and the copyright management information (CMI) claim (the claim to direct infringement was not included in the motion to dismiss).  

Impact

The complaint has now been consolidated with the Tremblay and Chabon actions against OpenAI (see Tremblay action for further updates).

16 February 2024

Chabon & ors v Open AI (consolidated with Tremblay v OpenAI and Silverman v OpenAI)

(1) Michael Chabon (2) David Henry Hwang (3) Matthew Klam (4) Rachel Louise Snyder (5) Ayelet Waldman v (1) OpenAI, Inc. (2) OpenAI, L.P. (3) OpenAI Opco, L.L.C. (4) OpenAI GP, L.L.C. (5) OpenAI Startup Fund Gp I, L.L.C. (6) OpenAI Startup Fund I, L.P. (7) OpenAI Startup Fund Management, LLC

US

Case 3:23-cv-04625

Amended Complaint: 5 October 2023

Order consolidating related cases: 9 November 2023

Summary

This case has now been consolidated with Tremblay v OpenAI – see Tremblay entry for future updates.

A third set of proceedings has been brought against OpenAI in the US District Court for the Northern District of California. This claim has been brought by a group of authors, playwrights and screenwriters (on both an individual and representative basis), including the Pulitzer Prize-winning novelist Michael Chabon.

As with the other claims against OpenAI, the claims include direct and vicarious copyright infringement, violations of the DMCA, violations of California unfair competition law, negligence and unjust enrichment.

Impact

The claims against OpenAI have now been consolidated (see Tremblay action for further updates).

15 February 2024

Authors Guild & ors v OpenAI

(1) Authors Guild (2) David Baldacci (3) Mary Bly (4) Michael Connelly (5) Sylvia Day (6) Jonathan Franzen (7) John Grisham (8) Elin Hilderbrand (9) Christina Baker Kline (10) Maya Shanbhag Lang (11) Victor LaValle (12) George R.R. Martin (13) Jodi Picoult (14) Douglas Preston (15) Roxana Robinson (16) George Saunders (17) Scott Turow (18) Rachel Vail v (1) OpenAI, Inc. (2) OpenAI, L.P. (3) OpenAI Gp, LLC (4) OpenAI Opco LLC (5) OpenAI Global LLC (6) OAI Corporation LLC (7) OpenAI Holdings LLC, (8) OpenAI Startup Fund I, L.P. (9) OpenAI Startup Fund GP I, LLC (10) OpenAI Startup Fund Management, LLC

US

Case 1:23-cv-8292 

Complaint: 19 September 2023

Amended Complaint: 5 December 2023

Amended Complaint (consolidated with Alter action): 5 February 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Answer to First Consolidated Class Action Complaint by Microsoft: 16 February 2024  

Answer to First Consolidated Class Action Complaint by OpenAI: 16 February 2024  

Opposition to Motion to Intervene and Dismiss, Stay or Transfer by Microsoft: 26 February 2024  

Position Statement regarding Motion to Intervene and Dismiss, Stay or Transfer by OpenAI: 26 February 2024

Author Plaintiffs' Response to Motion to Intervene and Dismiss, Stay or Transfer: 26 February 2024

Reply to Response to Motion re Motion to Intervene and Dismiss, Stay or Transfer: 4 March 2024

Opinion & Order denying California Plaintiffs' motions to intervene for purpose of transferring, staying or dismissing the New York actions: 1 April 2024

Notice of Interlocutory Appeal filed by California Plaintiffs: 15 April 2024

Summary

This case has now been consolidated with Alter v OpenAI. In the most recent development, the New York Court has rejected an application by the various Plaintiffs in California to intervene in the New York proceedings for the purpose of moving to dismiss, stay or transfer the New York actions to California.

Following other class actions brought by authors against OpenAI, this case is particularly significant for a number of reasons. First, the Plaintiffs include The Authors Guild, alongside 17 well-known Authors Guild members such as John Grisham, Jodi Picoult, Jonathan Franzen, George RR Martin, David Baldacci and Scott Turow. Secondly, unlike the other claims, this one has been brought in the Southern District of New York. Thirdly, whilst there is overlap in the claims raised (direct copyright infringement, vicarious copyright infringement and contributory copyright infringement), other claims that have featured in the other cases against OpenAI have not been included.

On 5 February 2024, the Plaintiffs in this action, and in the Alter action, filed a consolidated class action complaint. The Plaintiffs in the In re ChatGPT litigation filed a Motion for this case, and others filed in the Southern District of New York, to be dismissed or stayed/transferred to the Northern District of California, but this application has been rejected.

Impact

The complaint tackles the question of 'fair use' head on, noting that there is "nothing fair" about what OpenAI has done and adding that its "unauthorized use of Plaintiffs' copyrighted works thus presents a straightforward infringement case applying well-established law to well-recognized copyright harms". Whilst the other cases may be expected to settle, settlement seems much less likely here given that this case involves The Authors Guild.

14 February 2024

Alter v OpenAI and Microsoft

Jonathan Alter, Kai Bird, Taylor Branch, Rich Cohen, Eugene Linden, Daniel Okrent, Julian Sancton, Hampton Sides, Stacy Schiff, James Shapiro, Jia Tolentino, and Simon Winchester v OpenAI, Inc., OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco LLC, OpenAI Global LLC, OAI Corporation, LLC, OpenAI Holdings, LLC, and Microsoft Corporation

US

Case: 1:23-cv-10211

Complaint: 21 November 2023

Amended Complaint: 19 December 2023

Amended Complaint (consolidated with Authors Guild action): 5 February 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Summary

This case has now been consolidated with Authors Guild v OpenAI (see Authors Guild entry for further updates).

This complaint is brought by a number of authors, on their own behalf and on behalf of a class against OpenAI and Microsoft, in the US District Court Southern District of New York. The claim is for infringement in the training of OpenAI and Microsoft's GPT models, as well as for contributory infringement by certain of the defendants.

On 5 February 2024, the Plaintiffs in this action, and in the Authors Guild action, filed a consolidated class action complaint.

Impact

The initial complaint's opening paragraph stated that "the basis of the OpenAI platform is nothing less than the rampant theft of copyrighted works". The complaint also noted that the Plaintiffs asked ChatGPT if one of the authors' works had been included in its training data, to which it answered "Yes, Julian Sancton's book 'Madhouse at the End of the Earth' is included in my training data". This is the first author class action complaint against OpenAI that also cites Microsoft as a defendant.

13 February 2024

Basbanes & Ngagoyeanes v Microsoft and OpenAI

Nicholas A. Basbanes and Nicholas Ngagoyeanes (professionally known as Nicholas Gage) v Microsoft Corporation, OpenAI, Inc., OpenAI GP, L.L.C., OpenAI Holdings, LLC, OAI Corporation, LLC, OpenAI Global, LLC, OpenAI, L.L.C., and OpenAI OpCo, LLC,

US

Case 1:24-cv-00084 

Complaint: 5 January 2024

Motion to consolidate cases (with Authors Guild and Alter actions): 22 January 2024

Motion to Intervene, and Dismiss, Stay or Transfer: 12 February 2024

Opinion & Order denying California Plaintiffs' motions to intervene for purpose of transferring, staying or dismissing the New York actions: 1 April 2024

Summary

This case has been consolidated with Authors Guild v OpenAI (see Authors Guild entry for further updates).

This class action complaint has been brought by two non-fiction authors / journalists against Microsoft and OpenAI in the US District Court Southern District of New York.  The complaint makes reference to that of the New York Times and is for direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement.

The action has been consolidated with the Authors Guild and Alter matters against OpenAI.

Impact

As the first AI infringement case issued in 2024, this case is the latest in a string of actions by authors against Microsoft and OpenAI, as well as the high profile complaint brought by The New York Times.  The complaint uses strong language describing the defendants as "no different than any other thief".

12 February 2024

The New York Times v Microsoft and OpenAI

The New York Times Company v (1) Microsoft Corporation, (2) OpenAI, Inc., (3) OpenAI LP, (4) OpenAI GP, LLC, (5) OpenAI, LLC, (6) OpenAI Opco LLC, (7) OpenAI Global LLC, (8) OAI Corporation, LLC, (9) OpenAI Holdings, LLC

US

Case: 1:23-cv-11195

Complaint: 27 December 2023

Motion to Intervene, and Dismiss, Stay or Transfer: 23 February 2024

Motion to Dismiss: 26 February 2024 

Response to Motion to Intervene and Dismiss, Stay or Transfer by OpenAI: 26 February 2024

Response to Motion to Intervene and Dismiss, Stay or Transfer by The New York Times: 1 March 2024

Motion to Dismiss by Microsoft: 4 March 2024

Reply to Opposition to Motion to Intervene and Dismiss, Stay or Transfer: 8 March 2024

Plaintiff's Memorandum of Law in Opposition to OpenAI's Partial Motion to Dismiss: 11 March 2024

Reply Memorandum of Law in Support of Motion by OpenAI: 18 March 2024

Plaintiff's Memorandum of Law in Opposition to Microsoft's Partial Motion to  Dismiss: 18 March 2024

Reply Memorandum of Law in Support re Motion to Dismiss filed by Microsoft Corporation: 25 March 2024

Opinion & Order denying California Plaintiffs' motions to intervene for purpose of transferring, staying or dismissing the New York actions: 1 April 2024

Notice of Interlocutory Appeal filed by California Plaintiffs: 15 April 2024

Notice of Motion and Motion for Leave to File First Amended Complaint: 20 May 2024

Letter Motion to Compel New York Times to Produce Documents: 23 May 2024

Letter Response in Opposition to Motion to Compel New York Times to Produce Documents: 28 May 2024

Opposition Brief filed by Microsoft Corporation: 3 June 2024

Response to Motion for Leave to File First Amended Complaint and Conditional Cross-Motion filed by OpenAI: 3 June 2024

Summary

The most recent development in this much-watched case concerns discovery, with OpenAI seeking to compel The New York Times to produce documents relating to how it generated the regurgitated content set out as examples of infringing outputs in Exhibit J to its complaint. The New York Times has responded that it does not intend to rely on Exhibit J at trial, provided that OpenAI responds to its discovery requests. Separately, The New York Times is seeking to add approximately 7 million additional works to the complaint.

This highly publicised case has been brought by The New York Times against Microsoft and OpenAI in the US District Court Southern District of New York, relating to ChatGPT (including associated offerings), Bing Chat and Microsoft 365 Copilot. It follows a period of months during which the NYT said it attempted to reach a negotiated agreement with Microsoft/OpenAI.

The Complaint raises arguments of large-scale commercial exploitation of NYT content through the training of the relevant models (including GPT-4 and the next generation GPT-5), noting that the GPT LLMs have also 'memorized' copies of many of the works encoded into their parameters. There are extensive exhibits (69 exhibits, comprising around 2,000 pages) attached to the Complaint. Exhibit J in particular contains 100 examples of output from GPT-4 (said to be a 'small fraction') based on prompts in the form of a short snippet from the beginning of an NYT article. The example outputs are said to recite NYT content verbatim (or near-verbatim), closely summarise it, and mimic its expressive style (and also wrongly attribute false information - hallucinations - to the NYT).

The Complaint also focuses on synthetic search applications built on the GPT LLMs, which display extensive excerpts or paraphrases of the contents of search results, including NYT content that may not have been included in the model's training set (noting that this contains more expressive content from the original article than would be the case in a traditional search result, and omits the hyperlink to the NYT website).

The claims are for direct copyright infringement, vicarious copyright infringement, contributory copyright infringement, DMCA violations, unfair competition by misappropriation, and trade mark dilution.

On 26 February 2024, OpenAI filed a Motion to Dismiss in relation to parts of the claim to direct copyright infringement (re conduct occurring more than 3 years ago), as well as the claims relating to contributory infringement, DMCA violations and state common law misappropriation. In particular, OpenAI alleges that the 'Times paid someone to hack OpenAI's products' and that it took 'tens of thousands of attempts to generate the highly anomalous results' in Exhibit J to the Complaint, including by targeting and exploiting a bug (which OpenAI says it has committed to addressing) in violation of its terms of use. OpenAI goes on to categorise the key dispute in the case as to whether it is fair use to use publicly accessible content to train generative AI models to learn about language, grammar and syntax, and to 'understand the facts that constitute humans' collective knowledge'. The New York Times has categorised OpenAI's motion as grandstanding, with an attention-grabbing claim about 'hacking' that is both irrelevant and false.

Microsoft filed its Motion to Dismiss parts of the claim on 4 March 2024, focusing on (1) the allegation that Microsoft is contributorily liable for end-user infringement, (2) violation of DMCA copyright management information provisions, and (3) state law misappropriation torts. Drawing an analogy with earlier disruptive technologies, the Motion states that "copyright law is no more an obstacle to the LLM than it was to the VCR (or the player piano, copy machine, personal computer, internet, or search engine)". Its point is that the US Supreme Court has previously rejected liability merely based on offering a multi-use product that could be used to infringe. It further states that Microsoft "looks forward to litigating the issues in this case that are genuinely presented, and to vindicating the important values of progress, learning and the sharing of knowledge".

Impact

The opening words of the complaint stress the importance of independent journalism for democracy - and the threat to the NYT's ability to provide that service by the use of its works to create AI products. It further highlights the role of copyright in protecting the output of news organisations, and their ability to produce high quality journalism.

The NYT website is noted in the Complaint as being the most highly represented proprietary source of data in the Common Crawl dataset, itself the most highly weighted dataset in GPT-3. Given the previous attempt at negotiations referred to in the complaint, it will be interesting to see if the launch of this complaint will lead to more fruitful licence negotiations, or whether this case will continue to trial (in which case, it should be tracked alongside the other complaints against OpenAI and Microsoft).

OpenAI's position is that 'training data regurgitation' (or memorisation) and hallucination are 'uncommon and unintended phenomena'. Memorisation is a problem that OpenAI says it is working hard to address, including through sufficiently diverse datasets. Meanwhile, it points to its partnerships with other media outlets.

11 February 2024

Chabon & ors v Meta Platforms, Inc

(1) Michael Chabon (2) David Henry Hwang (3) Matthew Klam (4) Rachel Louise Snyder (5) Ayelet Waldman v Meta Platforms Inc

US

Case: 4:23-cv-04633

Amended Complaint: 5 October 2023

Order granting Joint Motion to Dismiss (for reasons given in Kadrey v Meta Platforms): 20 November 2023

Order consolidating cases against Meta: 7 December 2023

Summary

The same set of authors, playwrights and screenwriters as in the third set of proceedings against OpenAI also brought a claim against Meta in the US District Court for the Northern District of California.  This case focused on Meta's LLaMA (Large Language Model Meta AI) and noted Meta's statements that LLaMA was trained using books, including from the Books3 section of The Pile dataset (assembled from content available on 'shadow library' websites, including Bibliotik), which the Plaintiffs contended includes their copyright works.

Again, the claims include direct and vicarious copyright infringement, violations of the DMCA, violations of California unfair competition law, negligence and unjust enrichment.

Impact

Developments in all of these cases should be monitored closely. The case has now been consolidated with another claim against Meta (Kadrey et al v Meta).

9 February 2024

Kadrey & ors v Meta Platforms, Inc

(1) Richard Kadrey (2) Sarah Silverman & (3) Christopher Golden v Meta Platforms, Inc

US

Case C 3:23-cv-03417

Complaint: 7 July 2023

Motion to dismiss by Meta: 18 September 2023

Plaintiffs' Opposition to Meta's Motion to dismiss: 18 October 2023

Reply re Motion to Dismiss: 1 November 2023

Order on Motion to Dismiss: 20 November 2023

Amended Complaint: 11 December 2023

Answer to Amended Complaint: 10 January 2024 

Motion to relate with Huckabee action: 16 January 2024

Order granting motion to relate with Huckabee action: 23 January 2024

Summary

Plaintiffs have brought a class action against Meta relating to its LLaMA (Large Language Model Meta AI) product in the US District Court for the Northern District of California. The claim notes Meta's statements that LLaMA was trained using books, including from the Books3 section of The Pile dataset (assembled from content available on 'shadow library' websites, including Bibliotik), which the Plaintiffs contend includes their copyright works.

The claims include direct and vicarious copyright infringement, violations of the DMCA, violations of California unfair competition law, negligence and unjust enrichment.

Impact

Meta filed a Motion to Dismiss parts of the claim – the Motion to Dismiss only applies partially to the claim of direct infringement. On this, Meta's Motion states: "Use of texts to train LLaMA to statistically model language and generate original expression is transformative by nature and quintessential fair use—much like Google’s wholesale copying of books to create an internet search tool was found to be fair use in Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015)." Clearly, the issue of fair use is going to be central to this debate.

On Thursday 9 November 2023, US District Judge Vince Chhabria indicated that he would grant Meta's motion to dismiss the claims that content generated by Meta's LLaMA tool infringes their copyright (and also that LLaMA is itself an infringing work), but would give the plaintiffs permission to amend most of their claim.

On 11 December 2023, the Plaintiffs filed their amended Complaint. The claim has been consolidated with that brought by a number of authors including Michael Chabon.

8 February 2024

Andersen v Stability AI

(1) Sarah Andersen, (2) Kelly McKernan & (3) Karla Ortiz v (1) Stability AI Ltd, (2) Stability AI, Inc, (3) Midjourney, Inc, (4) DeviantArt, Inc

US

CASE 3:23-CV-00201

Complaint: 13 January 2023

Defendants filed a number of motions to dismiss and/or Anti-SLAPP Motions to Strike: 18 April 2023

Plaintiffs opposed these motions: 2 June 2023

Defendants filed motions to dismiss and/or motions to dismiss and strike: 3 July 2023

Judge Orrick indicated he would dismiss most of the claims brought by the Plaintiffs against the Defendants with leave to amend: 19 July 2023

Order by Judge William H Orrick: 30 October 2023

Amended Complaint: 29 November 2023

Motion to Strike (DeviantArt's Motion to Renew its Special Motion to Strike (anti-SLAPP)): 20 December 2023

Opposition/Response re anti-SLAPP motion: 10 January 2024

Reply re anti-SLAPP motion: 17 January 2024

Motion to Dismiss First Amended Complaint filed by Midjourney: 8 February 2024

Motion to Dismiss First Amended Complaint filed by Stability AI: 8 February 2024

Motion to Dismiss First Amended Complaint filed by DeviantArt: 8 February 2024

Motion to Dismiss First Amended Complaint filed by Runway: 8 February 2024

Order denying Motion to Strike by Judge William H. Orrick: 8 February 2024

Opposition/Response re Stability AI's Motion to Dismiss filed by Plaintiffs: 21 March 2024

Opposition/Response re Runway AI's Motion to Dismiss filed by Plaintiffs: 21 March 2024

Opposition/Response re DeviantArt's Motion to Dismiss filed by Plaintiffs: 21 March 2024

Opposition/Response re Midjourney's Motion to Dismiss filed by Plaintiffs: 21 March 2024

Reply re Motion to Dismiss Plaintiffs' First Amended Complaint filed by MidJourney: 18 April 2024

Reply re Motion to Dismiss Plaintiffs' First Amended Complaint filed by StabilityAI: 18 April 2024

Reply re Motion to Dismiss Plaintiffs' First Amended Complaint filed by DeviantArt: 18 April 2024

Reply re Motion to Dismiss Plaintiffs' First Amended Complaint filed by Runway AI: 18 April 2024

Procedures and tentative rulings for hearing: 7 May 2024

Summary

In the most recent development, the Court has given a tentative ruling in advance of a hearing on 8 May 2024 in which Judge Orrick has indicated that he is inclined to deny the Defendants' motions to dismiss the claims of direct and induced infringement and that these claims should therefore proceed (alongside other tentative findings).

This is a case brought against Stability AI (and the providers of other AI tools, such as Midjourney and DeviantArt), this time by a group of visual artists acting as individual and representative plaintiffs. The claim was filed in the US District Court for the Northern District of California.

The Plaintiffs have filed for copyright infringement, Digital Millennium Copyright Act violations, and related state law claims. They allege that the Defendants used their (and other artists’) works to train Stable Diffusion without obtaining their permission. According to the Plaintiffs, when the Defendants’ AI tools create "new images" based entirely on the training images, they are creating an infringing derivative work.

The Plaintiffs seek to bring their suit as a class action on behalf of "millions of artists" in the U.S. that own a copyright in any work that was used to train any version of the AI tools. 

On 19 July 2023, Judge Orrick indicated in a tentative ruling that he would dismiss almost all of the claims against the Defendants but would give the Plaintiffs leave to amend. Of particular note is that the Judge stated that the Plaintiffs need to differentiate between the Defendants and elaborate on what role each of the Defendants played with respect to the allegedly infringing conduct. The Judge was sceptical as to the extent the AI tool relied on the Plaintiffs' works to generate the output images, as the AI model was trained on billions of images. He also expressed doubts as to whether the output images were substantially similar to the Plaintiffs' original works.

On 30 October 2023, Judge Orrick's order was published, dismissing parts of the claim. However, the Plaintiffs were given leave to amend, with the Judge requiring them to clarify their infringement claims. Stability AI's motion to dismiss the claim against it for direct copyright infringement was denied.

On 29 November 2023, the Plaintiffs filed their Amended Complaint, which included a number of new plaintiffs joining the complaint.

On 8 February 2024, Judge Orrick denied the Defendants' motion to strike under California's anti-SLAPP (strategic lawsuits against public participation) statute which had been directed solely at the Plaintiffs' right of publicity claims, on the basis that the Complaint and Amended Complaint fell within the anti-SLAPP statute's public interest exception.

On 7 May 2024, Judge Orrick issued a number of tentative rulings in advance of a hearing on 8 May as follows:

  • He is inclined to deny all of the Defendants' motions to dismiss the direct and induced infringement claims. Beyond the Training Images theory, the Plaintiffs have plausibly alleged facts to suggest compressed copies of their works are contained in Stable Diffusion.
  • He is inclined to grant the Defendants' motions to dismiss all DMCA claims.
  • He is inclined to deny Midjourney's motions to dismiss the false endorsement and trade dress claims.
  • He is inclined to grant DeviantArt's motions to dismiss the contract claim for express breach and breach of the implied covenant of good faith and fair dealing.
  • Plaintiffs are to be allowed to file a Second Amended Complaint adding new Plaintiffs.
  • New unjust enrichment claims cannot be added but the Plaintiffs will be allowed one final attempt to allege non-preempted unjust enrichment claims against specific Defendants.

Impact

In this case, one of the Plaintiffs' arguments is that AI tools which create art “in the style of” an existing artist are infringing derivative works. Copyright infringement requires copying, so the Plaintiffs will have to convince the court that a completely new piece of art “in the style of” an existing artist could be categorised as “copying” that artist.

7 February 2024

Getty Images v Stability AI

Getty Images (US), Inc. v Stability AI Ltd 

US

Case 1:23-cv-00135-UNA

Complaint: 3 February 2023

Amended Complaint: 29 March 2023.

Defendants' Motion to Dismiss or Transfer: 2 May 2023.

Summary

In addition to its claim against Stability AI in the UK, Getty Images has brought proceedings in the US District Court for the District of Delaware.

Getty Images' complaint is for copyright infringement, providing false copyright management information, removal or alteration of copyright management information, trademark infringement, unfair competition, trademark dilution, and related state law claims.

In response to Getty Images' amended complaint, Stability AI filed a motion to dismiss for lack of personal jurisdiction, inability to join a necessary party, and failure to state a claim, or alternatively, a motion to transfer the lawsuit to the US District Court for the Northern District of California.

Impact

This case should be tracked alongside the action in the UK, though different issues may arise for consideration given potential divergences e.g., in relation to defences to copyright infringement.

6 February 2024

Getty Images v Stability AI

(1) Getty Images (US), Inc. (2) Getty Images International U.C. (3) Getty Images (UK) Ltd (4) Getty Images Devco UK Ltd (5) Stockphoto LP (6) Thomas M. Barwick, Inc v Stability AI Ltd 

UK

Claim No. IL-2023-000007

Claim Form: 16 January 2023

Particulars of Claim: 12 May 2023

Judgment on Stability AI's summary judgment/strike out application: 1 December 2023

Defence: 27 February 2024

Reply: 26 March 2024

Summary

Following Stability AI's unsuccessful attempt to obtain summary judgment / strike out of Getty Images' claim, it has now filed its Defence in this matter.  This raises a number of broad issues, with the focus being its argument that the training and development of the model took place outside the UK. Further, it argues that the examples of infringing outputs relied upon by Getty Images were generated by 'wilful contrivance' but that, in any event, the act of generating outputs is that of the user, over whom it has no control and of whose prompts it has no knowledge. Interestingly, the Defence also raises the fair dealing defence for the purposes of pastiche.

This claim has been brought by Getty Images against AI image generator Stability AI in the UK High Court.

Getty Images' claim (as summarised in its press release when commencing the claim) is that, through its Stable Diffusion model (under the name DreamStudio), Stability AI has "unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI's commercial interests and to the detriment of content creators".

The claims relate to copyright infringement, database right infringement, and trade mark infringement and passing off.

In brief, Getty Images claims that Stable Diffusion was trained using various subsets of the LAION-5B Dataset which was created by scraping links to photos and videos and associated captions from various websites: Getty Images claims that Stable Diffusion 1.0 was trained using around 12 million visual assets (of which around 7.3 million are copyright works) from Getty Images websites. It further claims that Stable Diffusion 2.0 was trained using around 7.5 million visual assets (of which around 4.4 million are copyright works) from Getty Images websites.

Getty Images also claims that in some cases the synthetic image produced by a user comprises a substantial part of one or more of its copyright works and/or visual assets, suggesting that Stable Diffusion sometimes memorises and generates very similar images to those used to train it. In some cases, the synthetic images produced bear the GETTY IMAGES and ISTOCK signs as a watermark.

Getty Images seeks to restrain the Defendant from doing a number of acts in the UK, without a written licence or agreement from Getty Images.

Stability AI applied for summary judgment / strike out in respect of certain aspects of Getty Images' claim. In particular, it argued that, as the evidence indicated that the training and development of Stable Diffusion took place outside the UK, the claim relating to copyright and database right infringement in that process was bound to fail. On 1 December 2023, the Court rejected Stability AI's application. Whilst the evidence referred to would on its face provide strong support for a finding that no development or training had taken place in the UK, there was other evidence pointing away from that conclusion, as well as a number of unanswered questions and inconsistencies in the evidence. Accordingly, the Court allowed that claim to proceed to trial, alongside a claim for secondary infringement of copyright which again the Court concluded could not be determined on a summary basis. 

On 27 February 2024, Stability AI filed its Defence. In summary, it denies that:

  • Development and training of the Stable Diffusion models infringed any of Getty Images' IP rights on the basis that the models were trained and developed outside the UK.
  • Making the Stable Diffusion model checkpoints available for download on GitHub or Hugging Face, or for use via DreamStudio, involves any acts of secondary infringement (because Stable Diffusion is not an infringing copy, is not an article, and has not been imported into the UK by Stability).
  • Use of Stable Diffusion by users gives rise to claims of infringement.  In particular, it argues that the examples of infringing outputs relied upon were generated by 'wilful contrivance using prompts corresponding exactly or substantially to captions' for Getty Images' works. It further asserts that the act of generating outputs is that of the user (over whom it has no control and of whose prompts it has no knowledge), not Stability; that it has not made any use of the Getty trade marks in the course of trade; and that it is entitled to rely upon the caching and hosting safe harbours.

Interestingly, Stability AI also asserts that, to the extent that any images do include any element of a copyright work, it is able to rely upon the fair dealing defence for the purposes of pastiche (a defence which has not yet been the subject of significant judicial commentary, other than in the Shazam case relating to Only Fools and Horses).

Impact

As noted by Peter Nunn in an article in The Times:

"If Getty Images is successful in the UK claim, the court could award it substantial damages and grant an injunction preventing Stability AI from continuing to use the copyright works of Getty Images. This could have knock-on effects, deterring other AI innovators from scraping the internet to use content without the owners’ consent, but also prompting governments to speed up changes to their intellectual property laws so as to permit greater use of protected works in the training of AI programmes."

Peter Nunn discusses AI 'plagiarism' in The Times (mishcon.com)

 

4 February 2024

Huckabee & ors v Meta, Bloomberg, Microsoft and The EleutherAI Institute

(1) Mike Huckabee (2) Relevate Group (3) David Kinnaman (4) TSH Oxenreider (5) Lysa Terkeurst (6) John Blase v (1) Meta Platforms, Inc. (2) Bloomberg L.P. (3) Bloomberg Finance L.P. (4) Microsoft Corporation (5) The Eleutherai Institute

US

Case: 1:23-cv-09152

Complaint: 17 October 2023

Letter re Bloomberg's proposed Motion to Dismiss: 15 December 2023

Letter re Opposition to Bloomberg's proposed Motion to Dismiss: 22 December 2023

Notice of Voluntary Dismissal re The EleutherAI Institute: 28 December 2023

Notice severing and transferring claims against Meta and Microsoft to US District Court for the Northern District of California: 28 December 2023

First Amended complaint against Bloomberg Finance: 24 January 2024

Letter re Bloomberg's proposed Motion to Dismiss: 31 January 2024

Motion to Dismiss by Bloomberg (Memorandum of Law): 22 March 2024

Plaintiffs' Opposition to Motion to Dismiss: 19 April 2024

Reply Memorandum of Law in Support of Motion: 3 May 2024

Summary

There have been some changes to the parties in this case, with the complaint against The EleutherAI Institute being voluntarily dismissed and the complaints against Meta and Microsoft severed and transferred.

Former Presidential candidate and former Governor of Arkansas Mike Huckabee and a group of other plaintiffs have brought a class action against Meta, Bloomberg, Microsoft and The EleutherAI Institute in the United States District Court Southern District of New York. The complaint focuses on EleutherAI's dataset 'The Pile', which includes among its data sources 'Books3', a dataset comprising a large collection (said to be approximately 18,000) of pirated ebooks.  The complaint notes that The Pile, and specifically Books3, was a popular training dataset for companies developing AI technology, including the Defendants in this case.

As in other cases, the complaint alleged direct copyright infringement, vicarious copyright infringement, DMCA claims (removal of copyright management information), conversion, negligence, and unjust enrichment.

The Plaintiffs have since voluntarily dismissed the complaint against The EleutherAI Institute, and the complaints against Meta and Microsoft have been severed and transferred to California. In the Amended Complaint filed in January 2024, the Plaintiffs withdrew their indirect copyright infringement, DMCA and state-law claims, leaving the direct copyright infringement claim to be argued.

Impact

This is the first case involving Bloomberg, which the complaint notes launched the world's first LLM built from scratch for finance. The complaint notes that Bloomberg had stated that it would not use the Books3 dataset to train future versions of BloombergGPT, but further notes that LLM training is iterative and builds on prior versions, with the Plaintiffs' works already 'baked in'.

2 February 2024

J.Doe 1 and J.Doe 2 v Github, Microsoft and OpenAI

J. DOE 1 and J. DOE 2, individually and on behalf of all others similarly situated, Individual and Representative Plaintiffs v. (1) Github, Inc. (2) Microsoft Corporation; (3) OpenAI, Inc.; (4) OpenAI, L.P.; (5) OpenAI Gp, L.L.C., (6) OpenAI Opco, L.L.C. (7) OpenAI Startup Fund Gp I, L.L.C.; (8) OpenAI Startup Fund I, L.P.; (9) OpenAI Startup Fund Management, LLC   

US

Case 3:22-cv-06823

Complaint: 3 November 2022

Open AI motion to dismiss: 26 January 2023

Microsoft and Github's motion to dismiss: 26 January 2023

Plaintiffs' amended complaint: 8 June 2023

OpenAI motion to dismiss amended complaint: 29 June 2023

Microsoft and Github motion to dismiss amended complaint: 29 June 2023 

Amended Complaint: 21 July 2023

Opposition/Response to Motion to Dismiss: 27 July 2023

Reply by Github, Microsoft: 10 August 2023

Reply by OpenAI: 10 August 2023

Order granting in part, denying in part Motion to Dismiss: 3 January 2024

Second Amended Complaint: 25 January 2024

Motion to Dismiss Second Amended Complaint: 28 February 2024

Opposition/Response re Github and Microsoft's Motion to Dismiss Portions of the Second Amended Complaint in Consolidated Actions filed by Plaintiffs: 27 March 2024

Opposition/Response re OpenAI's Motion to Dismiss Portions of the Second Amended Complaint in Consolidated Actions filed by Plaintiffs: 27 March 2024

Reply filed by Github and Microsoft: 10 April 2024

Reply filed by OpenAI: 10 April 2024

Order denying Plaintiffs' Motion for Reconsideration re Order on Motion to Dismiss: 15 April 2024

Summary

This class action, brought in the US District Court for the Northern District of California, targets both Copilot and OpenAI's Codex tool, which provides the technology underlying Copilot. Copilot helps developers write code by generating suggestions based on what it has learned from publicly available code.

The complaint focuses on four key areas:

  • An allegation that Copilot violates provisions of the Digital Millennium Copyright Act by ingesting and distributing code snippets (copyrighted information) without including the licence terms, copyright notice and author attribution.
  • An allegation that, by not complying with open licence notices, Copilot breaches the conditions of such licences by which the original code had been made available to Copilot/Codex.
  • An allegation that Copilot passes off code as an original creation and that GitHub, Microsoft and OpenAI have therefore been unjustly enriched by Copilot's subscription-based service. This is a claim for unfair competition.
  • An allegation that GitHub violates the Class's rights under the California Privacy Act, the GitHub Privacy Statement and/or the California Constitution by, inter alia, sharing the Class's sensitive personal information; creating a product that contains personal data that GitHub cannot delete, alter, or share with the applicable Class member; and selling the Class's personal data.

The Plaintiffs are seeking damages and injunctive relief.

The Defendants argued that the Plaintiffs lack standing and filed motions to dismiss the complaint. After being granted leave to amend, the Plaintiffs filed an amended complaint in June 2023, which largely resembled their initial complaint but included examples of licensed code owned by three of the Plaintiffs that had been output by Copilot, arguing that this demonstrates the Defendants removed their copyright management information and emitted their code in violation of their open-source licences. On 3 January 2024, the Court granted the motions to dismiss in part. In particular, the Judge held that the remaining two Plaintiffs had not established a 'particular personalized injury' to confer standing for damages, though this was satisfied for the three Plaintiffs referred to above. The Judge also held that the state law claims of intentional and negligent interference with prospective economic relations, unjust enrichment, negligence and unfair competition are pre-empted by the Copyright Act. The claims under the DMCA were also dismissed, with leave to amend.

On 7 September 2023, Microsoft issued a Copilot Copyright Commitment in which it offers to defend customers from IP infringement claims (i.e., any adverse judgments or settlements) arising from their use and distribution of output content generated by Microsoft's Copilot services. This is subject to certain criteria including that the customer used the guardrails and content filters built into the products.

Impact

The open source community will be watching this case with particular interest.

In its motion to dismiss, GitHub draws attention to its Terms of Service with respect to ownership of code generated by GitHub Copilot. This is simplified in its FAQ section (see 'Does GitHub own the code generated by GitHub Copilot?'), where GitHub suggests that "Copilot is a tool, like a compiler or pen" and, as a result, "the code you write with GitHub Copilot's help belongs to you".

However, the default legal position is not so clear-cut. Whilst GitHub has no interest in owning Copilot-generated source code that is incorporated into a developer's works, it is not clear whether Copilot's terms of use effectively assign IP rights to the developer.

You can read more about these issues here.

Any developers that use Copilot/Codex should review all output thoroughly and run open-source software audits to identify any potential issues.

Microsoft's Copyright Commitment has been effective since 1 October 2023 and applies to paid versions of Microsoft commercial Copilot services and Bing Chat Enterprise.

1 February 2024

Concord Music Group & ors v Anthropic PBC

Concord Music Group, Inc.; Capitol Cmg, Inc. D/B/A Ariose Music, D/B/A Capitol Cmg Genesis, D/B/A Capitol Cmg Paragon, D/B/A Greg Nelson Music, D/B/A Jubilee Communications, Inc., D/B/A Meadowgreen Music Company, D/B/A Meaux Hits, D/B/A Meaux Mercy, D/B/A River Oaks Music, D/B/A Shepherd’s Fold Music, D/B/A Sparrow Song, D/B/A Worship Together Music, D/B/A Worshiptogether.com Songs; Universal Music Corp. D/B/A Almo Music Corp., D/B/A Criterion Music Corp., D/B/A Granite Music Corp., D/B/A Irving Music, Inc., D/B/A Michael H. Goldsen, Inc., D/B/A Universal – Geffen Music, D/B/A Universal Music Works; Songs Of Universal, Inc. D/B/A Universal – Geffen Again Music, D/B/A Universal Tunes; Universal Music – Mgb Na Llc D/B/A Multisongs, D/B/A Universal Music – Careers, D/B/A Universal Music – Mgb Songs; Polygram Publishing, Inc. D/B/A Universal – Polygram International Tunes, Inc., D/B/A Universal – Polygram International Publishing, Inc., D/B/A Universal – Songs Of Polygram International, Inc.; Universal Music – Z Tunes Llc D/B/A New Spring Publishing, D/B/A Universal Music – Brentwood Benson Publishing, D/B/A Universal Music – Brentwood Benson Songs, D/B/A Universal Music – Brentwood Benson Tunes, D/B/A Universal Music – Z Melodies, D/B/A Universal v Anthropic Pbc, 

US

Case: 3:23-cv-01092

Complaint: 18 October 2023

Motion for a preliminary injunction: 16 November 2023

Motion to Dismiss by Anthropic: 22 November 2023

Opposition to motion for preliminary injunction: 16 January 2024

Opposition to motion to dismiss: 22 January 2024

Reply to Response re Motion for Preliminary Injunction: 14 February 2024

Summary

A number of music publishers (comprising Concord, Universal and ABKCO) have brought an action against Anthropic in the United States District Court for the Middle District of Tennessee, Nashville Division. The complaint has been brought in order to "address the systematic and widespread infringement of their copyrighted song lyrics" alleged to have taken place during the process of Anthropic building and operating its AI models, referred to as 'Claude'.  In particular, the complaint notes that when a user prompts Claude to provide the lyrics to a particular song, it will respond with output containing all or significant portions of those lyrics. Further, when Claude is asked to write a song about a certain topic, the complaint alleges that this can involve reproduction of the publishers' copyrighted lyrics – for example, when asked to write a song "about the death of Buddy Holly", it responded by generating output that copies directly from the song "American Pie".

The complaint contains claims relating to direct copyright infringement, contributory infringement, vicarious infringement, and DMCA claims (removal of copyright management information).

In its response to the Plaintiffs' motion for a preliminary injunction, Anthropic argues that the Plaintiffs devised 'special attacks' in order to evade Claude's built-in guardrails and to generate alleged infringements through 'trial and error'.  It also relies upon the use of copyrighted material as inputs as 'fair use'.

Impact

This is the first case involving the music industry, and also the first involving the AI developer Anthropic. A number of websites currently aggregate and publish music lyrics – however, they do so through an existing licensing market by which the publishers license their copyrighted lyrics.

31 January 2024

Raw Story Media, Inc v OpenAI Inc

Raw Story Media, Inc. and Alternet Media, Inc. v OpenAI, Inc., OpenAI GP, LLC, OpenAI, LLC, OpenAI Opco LLC, OpenAI Global LLC, OAI Corporation LLC, OpenAI Holdings, LLC

US

Case: 1:24-cv-01514

Complaint: 28 February 2024

Motion to Dismiss filed by OpenAI: 29 April 2024

Memo in opposition to Motion to Dismiss: 13 May 2024

Reply to Memo in opposition to Motion to Dismiss: 20 May 2024 

 

Summary

This complaint, which has been brought by two news organisations in the US District Court Southern District of New York, is unusual because it does not include claims for copyright infringement. Instead, it alleges violations of the Digital Millennium Copyright Act in that thousands of the Plaintiffs' works were included in training sets with the author, title, and copyright information removed.

Impact

Presumably, copyright infringement claims have not been included because the works in question may not be registered.

30 January 2024

Thomson Reuters v Ross Intelligence

(1) Thomson Reuters Enterprise Centre GmbH and (2) West Publishing Corp. v Ross Intelligence Inc.

US

Case 1:20-cv-00613 

Memorandum Opinion: 25 September 2023

Trial on copyright issues: 26 August 2024

Summary

In May 2020, Thomson Reuters and West Publishing Corporation (the Plaintiffs) filed a claim for copyright infringement against ROSS Intelligence Inc. (ROSS). The Plaintiffs allege that ROSS “illicitly and surreptitiously” used a third-party Westlaw licensee, LegalEase Solutions, which in turn hired a subcontractor, Morae Global, to access and copy the Plaintiffs’ proprietary content on the Westlaw database. It is alleged that ROSS used the content to train its machine learning model to create a competing product.

The Plaintiffs are seeking injunctive relief and damages that they have suffered as a result of ROSS’ direct, contributory, and vicarious copyright infringement and intentional and tortious interference with contractual relations.

In the Memorandum Opinion issued in September 2023, Judge Stephanos Bibas denied the Plaintiffs' and ROSS's cross-motions for summary judgment, finding that only a jury can evaluate the four factors required for the fair-use defence to copyright infringement. These four factors are: (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and (4) the effect of the use upon the potential market for the copyrighted work. The trial on the copyright issues is set for 26 August 2024.

Impact

If this goes to trial, the case will be one of the first to test whether copyright owners can prevent businesses using copyrighted works for the purpose of training machine learning models for AI tools.

Summary

This case concerns whether copyright can be registered in a creative work made by artificial intelligence – specifically, a piece called 'A Recent Entrance to Paradise', which was created autonomously by an AI tool, the Creativity Machine. Dr Thaler, who created the Creativity Machine, listed the system as the work's creator and himself as the 'Copyright Claimant', describing the work as 'a work-for-hire to the owner of the Creativity Machine'.

The work was denied registration by the US Copyright Office on the basis that there was no human author to support a claim to copyright registration. The proceedings in the US District Court for the District of Columbia sought to overturn the USCO's refusal to register. The case was therefore a judicial review of the Copyright Office's decision as a final agency decision.

Following cross-motions for summary judgment, on 18 August 2023, Judge Beryl A. Howell issued an Order (and accompanying Memorandum Opinion) denying the Plaintiff's motion for summary judgment and granting the Defendants' cross-motion for summary judgment.

The Judge concluded that the Register had not acted arbitrarily or capriciously in reaching its conclusion that the copyright registration should be denied.

Dr Thaler has filed a Notice of Appeal to the US Court of Appeals for the District of Columbia Circuit. The US Copyright Office has filed its Reply Brief in which it asserts that human authorship is a basic requisite to obtain copyright protection, based on a straightforward application of the statutory text, history and precedent.  The Brief argues that the Copyright Act's plain text and structure establish a human authorship requirement. In terms of precedent, since the 19th century, the Supreme Court has recognised human creativity as the touchstone of authorship. It further argues that Dr Thaler has offered no sound reason to depart from these 'bedrock principles'.

Impact

Unusually, the question here was purely a legal one: are AI-generated works (created autonomously without any human input) copyrightable?

Thaler's argument is that AI generated works deserve copyright protection as a matter of policy. The Judge said that "copyright has never stretched so far, however, as to protect works generated by new forms of technology absent any guiding human hand … human authorship is a bedrock requirement of copyright".

The position on whether content created by AI generators is protectable differs from country to country (as noted below regarding the position in the UK as compared to the US). We have written about this here.

See below also for the US Copyright Office Statement of Policy in relation to works containing material generated by AI, which is to the effect that only the human-created parts of a generative AI work are protected by copyright.

It appears that in presenting argument to the Court, the Plaintiff implied a level of human involvement in the creation of the work that was not in accordance with the administrative record before the Copyright Office, which was to the effect that the work had been generated by the AI system autonomously and that Dr Thaler had played no role in its creation.

Legislative and policy developments

11 April 2024

The Generative AI Copyright Disclosure Bill

US

Introduced by Representative Adam Schiff: 9 April 2024

Summary

Introduced by Democratic Representative Adam Schiff, The Generative AI Copyright Disclosure Act would require a notice to be submitted to the Register of Copyrights prior to a new generative AI system being released, providing information on all copyrighted works used in building or altering the training dataset. It would also apply retroactively to existing generative AI systems.

Impact

The Bill has attracted widespread support from across the creative community, including from industry associations and unions such as the Recording Industry Association of America, Copyright Clearance Center, Directors Guild of America, Authors Guild, National Association of Voice Actors, Concept Art Association, Professional Photographers of America, Screen Actors Guild-American Federation of Television and Radio Artists, Writers Guild of America West, Writers Guild of America East, American Society of Composers, Authors and Publishers, American Society for Collective Rights Licensing, International Alliance of Theatrical Stage Employees, Society of Composers and Lyricists, National Music Publishers Association, Recording Academy, Nashville Songwriters Association International, Songwriters of North America, Black Music Action Coalition, Music Artist Coalition, Human Artistry Campaign, and the American Association of Independent Music.

12 February 2024

UK approach to text and data mining

UK

UKIPO Code of Practice: On 6 February 2024, the UK Government confirmed it had not been possible to reach an agreement on a voluntary Code of Practice

Summary

In 2021, the UK Intellectual Property Office (UKIPO) consulted on potential changes to the UK's IP framework as a result of AI developments (importantly, this was before the increased levels of interest following the launch of ChatGPT etc).

In particular, a number of policy options were considered relating to the making of copies for the purposes of text and data mining (TDM), a crucial tool in the development and training of AI tools. Currently, an exception is in place under UK copyright law to allow copying for the purposes of TDM, but only where it is for the purpose of non-commercial research, and only where the researcher has lawful access to the works.

Alongside retaining the current exception, or simply improving the licensing environment for relevant works, the consultation sought views on three alternative options:

  • Extend the TDM exception to cover commercial research.  
  • Adopt a TDM exception for any use, with a right-holder opt-out – modelled on the TDM exception recently introduced in the EU. This would give rights holders the right to opt out individual works, sets of works, or all of their works if they do not want them to be mined.
  • Adopt a TDM exception for any use, with no right-holder opt-out – similar to an exception in Japan for information analysis, and also in Singapore.

In June 2022, the UKIPO published the Government’s response to the consultation, which was in favour of the widest and most liberal of the options under discussion, i.e., a TDM exception for any use, with no right-holder opt-out. Specifically, it was noted that the widening of the exception would ensure that the UK's copyright laws were "among the most innovation-friendly in the world", allowing "all users of data mining technology [to] benefit, with rights holders having safeguards to protect their content". The main safeguard identified for rights holders was the requirement for lawful access.

Following widespread criticism, however, in particular relating to concerns from the creative industries, the then Minister for Science, Research and Innovation confirmed in February 2023 that the proposals would not proceed.

However, following Sir Patrick Vallance's Pro-Innovation Regulation of Technologies Review on Digital Technologies, which called upon the Government to announce a clear policy position, the Government's response confirmed that it had asked the UKIPO to produce a code of practice. The code of practice is intended to provide balanced and pragmatic guidance enabling AI firms to access copyright-protected works as an input to their models, whilst ensuring protections (such as labelling) are in place on generated outputs to support rights holders. The Government suggests that an AI firm that commits to the code of practice can expect to be offered a reasonable licence by a rights holder. If a code of practice cannot be agreed or adopted, however, legislation may have to be implemented.

Members of Parliament continue to express their views on this issue. In an interim report on governance of AI by the House of Commons Science, Innovation and Technology Committee (dated 31 August 2023), 'the Intellectual Property and Copyright Challenge' was identified as one of the 12 challenges of AI governance. Representatives of the creative industries reported to the Committee that they hoped to reach a mutually beneficial solution with the AI sector, potentially in the form of a licensing framework. Meanwhile, in its report on Connected tech: AI and creative technology (dated 30 August 2023), the House of Commons Culture, Media and Sport Committee welcomed the Government's rowing back from a broad TDM exception, suggesting that the Government should proactively support small AI developers, in particular, who may find it difficult to acquire licences, by considering how licensing schemes can be introduced for technical material and how mutually beneficial arrangements can be agreed with rights management organisations and creative industry bodies. Further, it stressed to the Government that it "must work to regain the trust of the creative industries following its abortive attempt to introduce a broad text and data mining exception".

In its response to the House of Commons Culture, Media and Sport Committee's report on AI and the creative industries, the Government confirmed that it is not proceeding with a wide text and data mining exception and reiterated its commitment to developing a code of practice to "enable the AI and creative sectors to grow in partnership". The timeline for finalising the code has, however, now slipped to 'early 2024'. We report on this in more detail in this article.

In its report on 'Large Language Models and Generative AI' (published 2 February 2024), the House of Lords Communications and Digital Committee noted that the voluntary IPO-led process was welcome and valuable, but that debate could not continue indefinitely: if the process remained unresolved by Spring 2024, the Government must set out options and prepare to resolve the dispute definitively, including through legislative change if necessary. Following reports in The Financial Times that the code of practice had been shelved, the Government confirmed this in its response to the AI White Paper consultation published on 6 February. DSIT and DCMS ministers will now lead a period of engagement with the AI and rights holder sectors and will report on the way forward shortly. This will include exploring mechanisms for providing greater transparency for rights holders.

Impact

The Government had initially indicated that the code of practice would be published in Summer 2023 but later said it was expected in 'early 2024'. The Code of Practice has now been shelved pending further engagement with relevant stakeholders.

8 February 2024

UK approach to copyright protection of computer-generated works

UK

Monitor for developments

Summary

In contrast to the approach adopted in most other countries, copyright is available in the UK to protect computer-generated works (CGWs) where there is no human creator. The author of such a work is deemed to be the person by whom the necessary arrangements for the creation of the work were undertaken, and protection lasts for 50 years from the date when the work was made.

How this applies in relation to content created with generative AI is currently untested in the UK.  In its consultation in 2021, the Government sought to understand whether the current law strikes the right balance in terms of incentivising and rewarding investment in AI creativity. 

Some have criticised the UK provision for being unclear and contradictory – a work, including a CGW, must be original to be protected by copyright, but the test for originality is defined by reference to human authors, and by reference to human traits such as whether it reflects their 'free and expressive choices' and whether it contains their 'stamp of personality'. 

From an economic perspective, meanwhile, it has been argued that providing copyright protection for CGWs is excessive because the incentive argument for copyright does not apply to computers. Further, some argue from a philosophical viewpoint that copyright should be available to protect only human creations, and that granting protection for CGWs devalues the worth of human creativity.

The consultation proposed the following three policy options, with the Government ultimately deciding to adopt the first option of making no change to the existing law at present:

  • Retain the current scheme of protection for CGWs
  • Remove protection for CGWs
  • Introduce a new right of protection for CGWs, with a reduced scope and duration

Impact

Having consulted, the Government decided to make no changes to the law providing copyright protection for CGWs where there is no human author, but said that this was an area that it would keep under review. In particular, it noted that the use of AI in the creation of these works was still in its infancy, and therefore the impact of the law, and any changes to it, could not yet be fully evaluated.

In view of recent developments, it is clear that this policy approach may need to be revisited sooner rather than later.

We discussed this and the comparison with the approach in the US in our article here (and see further below).

16 January 2024

EU AI Act

EU

Political agreement reached in trilogue discussions: 9 December 2023

European Commission Q&A: 12 December 2023

European Parliament approved AI Act: 13 March 2024

European Council approved AI Act: 21 May 2024

Summary

On 9 December 2023, the European institutions reached a provisional agreement on the EU AI Act. The Act has now been approved by both the European Parliament and the European Council.

In relation to copyright, the Act contains provisions relating to obligations on general-purpose AI systems around compliance with EU copyright law (including relating to text and data mining and opt-outs under the EU Digital Single Market Copyright Directive) and transparency around content used to train such models (in the form of sufficiently detailed summaries, which will be by reference to a form template to be published by the proposed AI Office). There is also a requirement that certain AI-generated content (essentially 'deep fakes') be labelled as such.

Impact

The Act will enter into force 20 days after publication in the Official Journal and will be fully applicable 24 months after its entry into force, though certain provisions will apply earlier and others at 36 months. There are staggered dates for when different parts of the Act will take effect:

  • 6 months after coming into force, provisions concerning banned AI practices take effect
  • 1 year after coming into force, provisions on penalties, confidentiality obligations and general-purpose AI take effect
  • 2 years after coming into force, the remaining provisions take effect
  • 3 years after coming into force, obligations for high-risk AI systems forming a product (or safety component of a product) regulated by EU product safety legislation apply

3 January 2024

USCO Statement of Policy

US

USCO Statement of Policy: 10 March 2023

Summary

In March 2023, the US Copyright Office published a Statement of Policy setting out its approach to registration of works containing material generated by AI.

The guidance states that only the human created parts of a generative AI work are protected by copyright. Accordingly, only where a human author arranges AI-generated material in a sufficiently creative way that ‘the resulting work as a whole constitutes an original work of authorship’ or modifies AI-generated content ‘to such a degree that the modifications meet the standard for copyright protection,’ will the human-authored aspects of such works be potentially protected by copyright. 

This statement follows a decision by the USCO on copyright registration for Zarya of the Dawn ('the Work'), an 18-page graphic novel featuring text alongside images created using the AI platform Midjourney. Originally, the USCO issued a copyright registration for the graphic novel before undertaking investigations which showed that the artist had used Midjourney to create the images. Following this investigation (which included viewing the artist’s social media), the USCO cancelled the original certificate and issued a new one covering only the text and the selection, coordination, and arrangement of the Work’s written and visual elements. In reaching this conclusion, the USCO deemed that the artist’s editing of some of the images was not sufficiently creative to be entitled to copyright as a derivative work.

Impact

The boundaries drawn by the USCO in relation to works created by generative AI confirm there are challenges for those that wish to obtain protection for such works. Developments should continue to be tracked, including in relation to ongoing litigation (see above).

9 August 2023

USCO Notice of inquiry and request for comments

US

Notice of inquiry and request for comments: 30 August 2023 (deadline for comments: extended to 6 December 2023)

Summary

As part of its study of the copyright law and policy issues raised by AI systems, the USCO is seeking written comments from stakeholders on a number of questions. The questions cover the following areas:

  1. The use of copyrighted works to train AI models – the USCO notes that there is disagreement about whether or when the use of copyrighted works to develop datasets is infringing. It therefore seeks information about the collection and curation of AI datasets, how they are used to train AI models, the sources of materials and whether permission by / compensation for copyright owners should be required.
  2. The copyrightability of material generated using AI systems – the USCO seeks comment on the proper scope of copyright protection for material created using generative AI. It believes that the law in the US is clear that protection is limited to works of human authorship but notes that there are questions over where and how to draw the line between human creation and AI-generated content. For example, a human's use of a generative AI tool could include sufficient control over the technology – e.g., through selection of training materials, and multiple iterations of prompts – to potentially result in output that is human-authored. The USCO notes that it is working separately to update its registration guidance on works that include AI-generated materials.
  3. Potential liability for infringing works generated using AI systems – the USCO is interested to hear how copyright liability principles could apply to material created by generative AI systems.  For example, if an output is found to be substantially similar to a copyrighted work that was part of the training dataset, and the use does not qualify as fair use, how should liability be apportioned between the user and the developer?
  4. Issues related to copyright – lastly, as a related issue, the USCO is also interested to hear about issues relating to AI-generated materials that feature the name or likeness, including vocal likeness, of a particular person; and also in relation to AI systems that produce visual works 'in the style' of a specific artist.

Impact

The issues raised in the Notice are wide-ranging, and some are before the courts for determination. One key issue to resolve is whether the use of AI in generating works could be regarded as akin to using a tool such as a typewriter to create a manuscript. Using a typewriter does not render the resulting manuscript uncopyrightable, in the same way that using Photoshop does not render a photographer's photo uncopyrightable. This is the approach that GitHub takes in respect of its Copilot service, for example, where it notes that "Copilot is a tool, like a compiler or pen" and, as a result, its position is that the code produced using GitHub Copilot should belong to the individual who used the tool. However, again, the legal position as to authorship/ownership is not so clear-cut. Whilst GitHub has no interest in owning Copilot-generated source code that is incorporated into a developer's works, it is not clear whether Copilot's terms of use effectively assign IP rights to the developer. It is also not clear whether there could be instances where the use of extensive and carefully worded prompts could result in someone being able to claim copyright in the material generated by an AI tool, on the basis that the author has ultimate creative control over the work. The USCO had previously considered this in its Statement of Policy. These are just a few issues on which clarity is needed.
