Microsoft seeks dismissal of NY Times’ lawsuit

Last updated on April 5th, 2024 at 10:09 am

The technology behemoth likened the lawsuit to Hollywood’s unsuccessful fight against the VCR, calling it short-sighted

Microsoft has responded to a copyright infringement lawsuit from the New York Times, which accuses the tech giant and ChatGPT-maker OpenAI of unlawfully using Times content to train generative artificial intelligence. Microsoft called the lawsuit a false narrative of “doomsday futurology” and likened it to Hollywood’s unsuccessful fight against the VCR, describing the claim as short-sighted.

In a motion to dismiss part of the lawsuit filed on Monday, Microsoft scoffed at the Times’ claim that its content receives special emphasis and that tech companies are trying to “free-ride” on the Times’ journalism investment.

The lawsuit, which could have significant implications for the future of generative artificial intelligence and news content production, alleges that Microsoft, as OpenAI’s largest investor, used the Times’ copyrighted material, including news articles, investigations, opinion pieces, and more, to create AI products that threaten the Times’ ability to provide its services.

In its response, Microsoft likened the lawsuit to Hollywood’s resistance in the late 1970s to the VCR, which consumers used to record TV shows and which the entertainment business feared would disrupt its economic model.

Quoting from Jack Valenti’s congressional testimony in 1982, Microsoft said, “‘The VCR is to the American film producer and the American public as the Boston strangler is to the woman home alone.’” In this instance, Microsoft argued, the Times was using its influence and platform to challenge the latest technological advance, the large language model.

Microsoft’s lawyers also contended that the content used to train these models does not replace the market for the original works; rather, it teaches the models language.

OpenAI has asked a judge to dismiss portions of the lawsuit against it, claiming that the publisher “paid someone to hack OpenAI’s products” to fabricate instances of copyright infringement involving ChatGPT.

“ChatGPT is in no manner a replacement for a subscription to The New York Times,” OpenAI’s attorneys stated. “In practice, individuals do not use ChatGPT or any other OpenAI product for such a purpose. Moreover, it is not feasible to use ChatGPT to present Times articles at one’s discretion in regular circumstances.”

After Microsoft filed its legal response, the Times responded, criticizing Microsoft’s use of the 1980s home-taping technology analogy.

Ian Crosby, lead counsel for the New York Times, stated in an email, “Microsoft does not dispute that it collaborated with OpenAI to copy millions of The Times’ works without permission to develop its tools.” He added that Microsoft’s comparison of LLMs to the VCR was odd, as VCR makers never argued that massive copyright infringement was necessary to build their products.

Crosby continued, “Despite Microsoft’s attempts to characterize its relationship with OpenAI as a mere ‘collaboration,’ in reality, as The Times’ complaint states, the two companies are closely connected in developing their generative AI tools.”

The legal battle comes amid a wave of lawsuits from authors and artists over copyright issues related to AI-generated creative work. There are also concerns about AI’s capability to produce highly misleading content, known in the industry as “hallucinations.”

Recently, Google faced backlash when its Gemini chatbot generated images depicting Black soldiers in World War II-era German military uniforms and Vikings wearing traditional Native American attire. Google apologized and temporarily disabled the technology’s ability to create such images, promising to address inaccuracies in historical depictions.

The dual concerns about AI’s potential to breach copyright and to generate inaccurate content come as OpenAI recently stated that training AI models without copyrighted works is “impossible,” given that copyright law now covers nearly all forms of human expression. OpenAI has declined to reveal the contents of its training datasets, including for its latest tool, a video generator named Sora.

In a letter to the UK’s House of Lords, OpenAI also argued that restricting training data to public domain content would not result in AI systems that meet the needs of modern society.

OpenAI’s CEO, Sam Altman, expressed surprise at the Times’ lawsuit, saying that the system did not require Times data for its training. He stated, “I think this is something that people don’t understand. Any one particular training source doesn’t significantly impact us.” Altman claimed that the Times’ articles constituted a small portion of the text corpus used to create ChatGPT.