DMR News

Advancing Digital Conversations

Authors Sue Nvidia for Using Copyrighted Books to Train AI Without Permission

By Hilary Ong

Mar 11, 2024

Nvidia, a leading chip manufacturer whose processors power much of today’s artificial intelligence (AI) technology, has been sued by three American authors. The authors accuse Nvidia of including their copyrighted books in the training dataset for its NeMo AI platform without permission.

This lawsuit puts Nvidia at the center of a growing legal challenge regarding the use of copyrighted materials in training generative AI systems, which are designed to produce new content based on various inputs, including text, images, and sounds.

The Core of the Dispute

The controversy surfaced when Brian Keene, Abdi Nazemian, and Stewart O’Nan discovered that their literary works were part of an extensive dataset consisting of approximately 196,640 books. This compilation was utilized to enhance NeMo’s capabilities in simulating conventional written language. The disputed dataset was removed in October following allegations of copyright infringement, a move that the authors interpret as Nvidia’s implicit acknowledgment of the copyright breach.

Filed in San Francisco federal court on a Friday night, the proposed class action lawsuit argues that Nvidia infringed the authors’ copyrights by using the dataset to train NeMo. The authors are seeking unspecified damages on behalf of all U.S. copyright holders whose works may have contributed to training NeMo’s large language models over the past three years.

The legal complaint specifically cites the following works as examples of the copyrighted content in question:

  • Brian Keene’s 2008 novel Ghost Walk
  • Abdi Nazemian’s 2019 novel Like a Love Story
  • Stewart O’Nan’s 2007 novella Last Night at the Lobster

How Has Nvidia Reacted?

Nvidia, which was co-founded by Jensen Huang, a Taiwan-born American entrepreneur, declined to comment on the lawsuit. The authors’ legal representatives did not respond to requests for further comment.

This legal challenge against Nvidia is part of a broader wave of litigation in which writers and publishers, including The New York Times, have confronted the creators and users of generative AI technologies over copyright issues. These lawsuits have also targeted other significant players in the tech industry, such as OpenAI, the developer of ChatGPT, and its collaborator, Microsoft.

The Impact on Nvidia’s Market Value

The rapid advancement of AI technologies has propelled Nvidia to the forefront of investor interest. Based in Santa Clara, California, the chipmaker’s stock has surged almost 600 percent since the close of 2022, elevating Nvidia’s market valuation to nearly US$2.2 trillion. This legal dispute, however, casts a shadow over Nvidia’s practices in sourcing training data for its AI models, raising critical questions about the ethical and legal dimensions of developing AI technologies.


Featured Image courtesy of jetcityimage/Getty Images
