Nvidia is facing a potential class-action lawsuit over its NeMo Megatron AI model. Three novelists sued the company for alleged copyright infringement, arguing that Nvidia has used their work to train its model and has therefore violated their books’ copyright protections.

The authors argue that Nvidia’s NeMo Megatron-GPT, first released in September 2022, copies and draws from their books “without consent, without credit, and without compensation.”

“During training, the LLM copies and ingests each textual work in the training dataset and extracts protected expression from it,” the complaint reads.

The lawsuit states that Nvidia’s NeMo Megatron large language model (LLM) was trained on EleutherAI’s dataset, dubbed “The Pile,” which consists of 800GB of data including 108GB worth of books. The Pile’s books component is also referred to as “Books3,” which is reportedly made up of more than 196,000 books from “Bibliotik” and includes those of the authors who filed the lawsuit.

Bibliotik is a login-gated “shadow library” of copyrighted books, and Books3 creator Shawn Presser has previously confirmed that Bibliotik’s entire library was used to make the AI dataset. In October 2023, Books3 was removed from AI data site Hugging Face over copyright concerns, but NeMo continues to use this dataset, the lawsuit argues.

The suit’s plaintiffs—novelists Abdi Nazemian, Brian Keene, and Stewart O’Nan—are seeking damages and requesting a class-action lawsuit so that all other authors whose work was included in the Books3 dataset can join the suit against Nvidia.

Reached for comment, an Nvidia spokesperson told PCMag via email: “We respect the rights of all content creators and believe we created NeMo in full compliance with copyright law.”

This lawsuit comes as AI tech firms see their stock prices soar to all-time highs while artists grow increasingly frustrated that their names and work are being used to train AI models without their permission and without being paid.

Nvidia isn’t the only tech firm with an AI model that’s being accused of copyright infringement, though. The New York Times’s copyright lawsuit against OpenAI and Microsoft over ChatGPT is ongoing. And in the past week, multiple artists have raised concerns that generative AI image tool Midjourney is using their unique styles to create outputs that pull from their bodies of work without their consent, calling it “dehumanizing” and disrespectful.

Editors’ Note: This story has been updated to include comment from Nvidia.
