Several fiction and nonfiction authors have launched legal action against tech companies over the alleged misuse of their work to train AI programs. But the New York Times’s lawsuit against OpenAI and Microsoft marks one of the largest to date and raises questions about how the companies might defend themselves in a legal battle that could set a precedent for how AI models are built.
A separate suit was filed Friday in Manhattan federal court by two nonfiction authors, Nicholas Basbanes and Nicholas Gage, who say the companies’ AI models rely on ingesting massive amounts of text for training, including their books. The plaintiffs’ proposed class action accuses the companies of violating their copyrights and seeks damages of up to $150,000 per infringed book. The complaint also brings claims of invasion of privacy, intrusion upon seclusion, larceny/receipt of stolen property, conversion, and unjust enrichment.
In the complaint, the authors argue that the companies violated their copyrights by copying tens of thousands of pages from their books without permission. The complaint states that the companies’ AI models were trained on the books, learning the authors’ writing styles, and that this training underpins the chatbot ChatGPT and other AI-based services.
According to the lawsuit, the authors were neither notified of nor compensated for the use of their works in the training process. The plaintiffs are seeking statutory damages as well as punitive damages. The suit follows similar legal actions by several writers, from comedian Sarah Silverman to “Game of Thrones” author George R.R. Martin, who have sued various tech companies for allegedly using their work to train AI programs.
Microsoft and OpenAI have not yet responded publicly to the allegations in the new lawsuit. Their silence may reflect litigation strategy or ongoing internal deliberations, or it may signal that the companies intend to hold off until a larger legal battle, such as the Times case, is resolved before addressing the issue publicly.
While licensing arrangements for training data may become more common, it’s unclear whether they would change the broader legal landscape around how copyrighted material is used to build these tools. The New York Times lawsuit also seeks the destruction of GPT models and other AI training datasets that incorporate its journalism, a demand that raises the stakes of any eventual courtroom battle over how these systems are built.