Multiple authors, including Pulitzer Prize-winning novelist Michael Chabon, have filed class action lawsuits against Meta and OpenAI, alleging that the companies used their copyrighted works without permission to train artificial intelligence (AI) systems. The lawsuits claim that Meta’s Llama AI software and OpenAI’s ChatGPT have infringed upon the authors’ intellectual property rights.
AI tools like Llama and ChatGPT are trained on large text datasets and generate output based on the text they have been exposed to. The authors argue that Meta and OpenAI used their works, including those still protected by copyright, to train their AI models without consent, credit, or compensation. Specifically, the authors claim that Meta copied material from their books and included it in the dataset used to train Llama. The lawsuits also allege that OpenAI similarly used the authors’ works to teach ChatGPT how to respond to user prompts.
Both Meta and OpenAI have faced criticism for their dataset acquisition methods. In Meta’s case, the company admitted to using a “Books” category in its dataset, which reportedly includes material from Project Gutenberg and a section of The Pile called Books3. Project Gutenberg offers books that are no longer under copyright, but the origins and legality of Books3, which was allegedly sourced from a shadow library, are uncertain.
The authors demand statutory damages, restitution of profits, and other remedies provided by law. These class action lawsuits are part of a growing trend in which AI companies face legal challenges over their use of copyrighted material to train AI systems.
As the lawsuits progress, the debate surrounding the use of copyrighted works in AI training datasets continues. While AI companies argue that the transformation of the original works qualifies as fair use, authors insist that their rights have been violated. This legal battle highlights the need to find a balance between the advancement of AI technology and the protection of intellectual property rights.
Q: What are the class action lawsuits about?
A: The class action lawsuits involve authors accusing AI companies Meta and OpenAI of using their copyrighted works without permission to train their AI systems.
Q: Which authors are involved in the class action lawsuits?
A: The plaintiffs include Pulitzer Prize-winning author Michael Chabon, playwright and Grammy Award winner David Henry Hwang, author Matthew Klam, author and Grammy Award and Golden Globe nominee Ayelet Waldman, and author Rachel Louise Snyder.
Q: What do the authors demand in the lawsuits?
A: The authors are seeking statutory damages, restitution of profits, and other remedies provided by law.
Q: What dataset did Meta use to train Llama?
A: Meta reportedly used a dataset that included material from Project Gutenberg and a section of The Pile called Books3.
Q: What is the legal debate surrounding the use of copyrighted works in AI training datasets?
A: AI companies argue that the transformation of the original works qualifies as fair use, while authors maintain that their intellectual property rights have been violated.