Spokane authors decry ‘egregious theft’ of their work allegedly by OpenAI, other AI companies

Oct. 8, 2023 Updated Mon., Oct. 9, 2023 at 8:45 a.m.

By Amanda Sullender amandas@spokesman.com (509) 459-5455

Upon the release of ChatGPT nearly a year ago, Montana-based author Jamie Ford asked the generative AI to write him the opening page of a novel in the style of Agatha Christie. Then in the style of Pat Conroy. Then in the style of Jamie Ford.

In each case ChatGPT spit back a “vague facsimile” of each author’s style, including his own.

“If you squint, perhaps it could be mistaken for the original author’s work. It is very clunky but it is clearly ripped from existing work. Including mine,” he said.

Ford is just one of tens of thousands of authors whose works have been used without their knowledge or consent to train these generative AI systems to use the English language to tell a story.

Last month the Authors Guild, a professional organization for published writers, and 20 of America’s most prominent authors launched a class-action lawsuit alleging OpenAI, the nonprofit developer of ChatGPT, illegally used copyrighted material to develop its AI systems.

The lawsuit joins several others that have targeted some of the lesser-known generative AI systems, such as Meta’s LLaMA or BloombergGPT, that have been accused of similar practices.

Because the lawsuit is a class action, the Authors Guild represents a group much larger than those named in it. Any author whose copyrighted works were used could receive compensation if there is a settlement or judgment in the case.

Authors named in the lawsuit include all-time bestsellers like John Grisham, Jodi Picoult and George R.R. Martin, among others. Though not named in the lawsuit, several local authors in Spokane and the Northwest signed an open letter in support of the lawsuit.

Sharma Shields called the actions of OpenAI and similar organizations an “egregious theft.” Shields is the author of two nationally published novels and works for the Spokane Public Library as a writing education specialist.

“There was no permission granted by any of the writers. There was no payment given to any of us writers. And most writers are just scraping by in whatever way we can so we can return to our art and give ourselves wholeheartedly over to the original words we are creating,” Shields said.

According to an income survey conducted by the Authors Guild last year, the median income collected by full-time authors from their writing is just over $20,000 a year.

OpenAI has not publicly disclosed the number of works that have been used to train its AI or where the books were sourced. Plaintiffs in the lawsuit allege their books were downloaded from pirated e-book repositories online.

Other AI systems are known to have used a specific dataset called Books3, which includes upwards of 183,000 books published over the past 20 years. AI systems using this database include one being developed by Meta, the parent company of Facebook.

Of her two books, Shields’ novel “The Cassandra” is included in the Books3 dataset. The novel tells the story of a woman working in a classified facility during World War II while her superiors work on the secret of nuclear fission.

The thought of her story being regurgitated and reworked into a supposedly new work “feels like a personal attack,” according to Shields.

“I had to delve so deeply into my own personal difficulties and trauma to write that book,” she said. “To have it used so very ignorantly and thoughtlessly is abhorrent.”

Other local authors also have novels included in the Books3 dataset. Ford has three novels in the repository. Spokane author Jess Walter has five of his books in the database, including his nonfiction book detailing the infamous Ruby Ridge standoff.

Is this fair use?

Legal arguments over the use of authors’ novels in the development of ChatGPT or any other generative AI system will center on whether the material AI creates constitutes “fair use” of the original work.

Fair use is an exception to copyright protections that allows copyrighted works to be used for commentary, criticism or any transformational purpose. At the center of the legal debate is whether AI’s use of these books to create new work is sufficiently transformational to fall under a fair-use exception.

In filings in a separate but similar lawsuit, OpenAI argues it is not infringement of copyright to “create wholesale copies of a work as a preliminary step” as long as that copyrighted work is used to develop a “new, non-infringing product,” which the company argues the output of its AI is.

Speaking to The Spokesman-Review, local authors disputed the idea.

“The ‘artificial intelligence’ in AI is highly overrated. It’s less intelligence and more an aggregation of information, which is a fancy way of copying or folding existing information into a larger whole,” Ford said.

ChatGPT and its compatriots are “not creating anything,” he argued.

“As far as I can tell, AI is just a shiny new way to rip off people’s work. A way to pretend a machine is coming up with all this when it’s just aggregating existing material created by people. And without that aggregated material, these AIs are dead in the water.”

What protections could be negotiated?

With lawsuits aplenty, this dispute between authors and tech companies is not likely to be resolved any time soon.

Walter said any AI system created through the use of copyrighted works needs to be “taken down immediately” while a process is negotiated to license copyrighted works for use in AI technologies.

“The first step is to cease and desist – to take down these products that are using copyrighted material, illegally and inappropriately. And then the second step would be some level of compensation,” Walter said.

Shields said she hopes some type of routinized system can be put in place to license copyrighted works for use in AI systems.

“I’m not quite sure what it would look like. There are so many different systems in place that protect us from having our work pirated. I don’t see why that couldn’t happen here too.”

Asked whether she would license her novels if she were compensated, Shields said she likely would not, but there may be other authors who would.

“A lot of us writers appreciate an opportunity to make more money off of our work,” she said. “So I feel like if that’s being offered, I would not judge a writer for choosing to allow companies to use their work in this way. But for me personally, I would be really hesitant to do that. I think sometimes the only thing I have are my stories.”

Ford said he is not sure such a system is viable at all.

“If your own licensed material can be used to disrupt your own career, that seems really unhelpful,” he said.

What is the future for writers?

Beyond legal questions, Spokane’s authors are concerned about the next generation of writers and creatives in an AI-dominated world.

“If I were a young writer, I would definitely feel threatened,” Walter said.

Shields fears that as AI develops, it could “take over storytelling,” which she sees as “heartbreaking.”

Walter believes writers most at risk are those who work in defined genres with a lot of conventions that might be easier to replicate.

“You can go on Amazon right now and buy a vampire book that’s been written by one of these programs,” he said. “And certainly you’re going to get a terrible vampire story when you trust a computer to write it. But it’s something that could certainly compete with the author who writes those books.”

Ford expressed a similar concern, saying he is worried not about his own work but about that of other authors.

“I would be really worried right now if my job was to crank out five murder mystery novels a year,” he said.

Ford pointed to TV writers for shows such as “Law & Order,” where there are decades of scripts an AI could use to create new episodes in the same mold. Both Ford and Walter cited protections won in the recent Hollywood writers’ strike as a possible model for novelists.

In part, the deal reached by the Writers Guild of America prohibits training AI with screenplays or television scripts without express permission from the writer.

Ford also noted AI technology will improve in the coming years and any protections negotiated now should anticipate that.

“You can write a few pages that sound alike but I don’t think an AI can write a 400-page novel right now,” he said. “But in a couple of years that could change. I do think we’ll probably have some sort of general AI in about five years, and it’s going to be a total circus.”

Amanda Sullender can be reached at (509) 459-5455 or by email at amandas@spokesman.com.
