NYT vs Microsoft: Supercomputer and Copyright Infringement

Here's what happened: Microsoft built something. Not just any cloud infrastructure, but an unusually complex supercomputer specifically designed to train OpenAI's language models on copyrighted works without permission. The New York Times is now saying this wasn't passive hosting—it was active encouragement of copyright infringement.

The timing matters. This amended complaint came after the Supreme Court handed down its decision in Sony v. Cox Communications, a case where Sony tried to claim Cox was contributing to music piracy as an internet service provider. The court sided with Cox, establishing a new standard: contributory infringement now requires proof of intentional inducement. Passive contribution isn't enough anymore.

The NYT saw this coming. They knew their original complaint—which painted Microsoft as just another cloud vendor—wasn't going to cut it under the new legal standard. So they amended.

Rewriting the Microsoft Allegation

In 2023, when the NYT first sued OpenAI, they described Microsoft's supercomputing systems as generic cloud services. Standard stuff. Azure hosting, compute cycles, the works.

The amended complaint tells a different story. According to the filing, Microsoft specifically designed this machine for the purpose of using essentially the whole internet—curated to disproportionately feature Times Works—to train what they hoped would be the most capable LLM in history.

Think about that for a second. We're talking about a bespoke system, ranked among the most powerful in the world, built with the explicit purpose of training AI on copyrighted journalism without permission. The NYT alleges that Microsoft not only helped select which works would be infringed but also provided the means to seize copyrighted content without authorization.

Microsoft's response? A spokesperson called it "a last-ditch effort by the plaintiff to save its claim from unfavorable precedent set in other recent rulings."

Fair enough. But the NYT argues neither party would be prejudiced by allowing the amendment. Legal standards changed, they say, and it's proper to revise arguments when that happens. Plus, the case schedule won't be set back because they're not seeking additional discovery.

The Discovery Evidence That Haunts Tech Firms

Here's where it gets messy. During discovery, the NYT produced user logs showing how people were using ChatGPT to bypass their paywall. You know the drill—someone asks for "the next paragraph" and suddenly they're reading full articles without paying a dime.

In some cases, users told ChatGPT they were trying to skirt paywalls and got significant chunks of articles. In others, the models just spit out several paragraphs without any finagling at all.

The NYT shared side-by-side comparisons in their complaints. GPT output next to the original Times article. Almost verbatim. It's not subtle.

And then there are the hallucinations. The complaint lists examples of AI models fabricating articles under fake NYT bylines. One example: ChatGPT claimed The Times had published an article linking orange juice to non-Hodgkin's lymphoma. Never happened. Made up.

Another involved Bing Chat citing fake quotes from Steve Forbes' daughter Moira Forbes. Also fabricated.

The NYT's argument is straightforward: users who ask a search engine what The Times has written on a subject should get a link to the actual article, not an unauthorized copy or an inaccurate forgery.

Market Harm and the Stakes for AI Training

The market harm argument is where this case could really bite. The NYT alleges that Microsoft's deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone.

That's a serious claim. And it ties directly into the fair use defense that OpenAI and Microsoft are leaning on.

OpenAI's position? Training AI on publicly available data is indisputably fair use. Their spokesperson Drew Pusateri said their models "empower innovation" and are "grounded in fair use."

But here's the problem for tech firms: one of the earliest rulings finding that AI training was fair use rested explicitly on the plaintiffs' failure to prove market harm. Last June, a federal judge laid out what he thinks could be a winning argument against AI training on copyrighted works, suggesting the fair use question is far from answered.

OpenAI has argued that ChatGPT isn't a substitute for a Times subscription because they "transformed the material for a different use." But if the NYT can convince the court that ChatGPT's use isn't so different from the newspaper's own use of its content, we're looking at serious consequences.

The Nuclear Option: Model Destruction

If the court rules in favor of the NYT and rejects the tech firms' fair use defense, we're not just talking about damages. We're talking about model deletion.

The NYT has asked for permanent injunctive relief to prevent future infringement, plus extensive damages. They're insisting that Microsoft and OpenAI "wrongfully profited from copyrighted works that they do not own."

In practice, that kind of relief could mean wiping models and starting over. For companies that have invested billions in training data and compute, that's an existential threat.

The NYT also voluntarily dropped two claims in this amendment: trademark dilution and one of the contributory copyright infringement claims. They're streamlining, focusing on their strongest arguments. Smart move.

The Global Context: AI Accountability Is Coming

This case doesn't exist in a vacuum. While the NYT is fighting Microsoft and OpenAI in US courts, similar pressures are building globally.

In Germany, a court issued a preliminary injunction against Google, holding the search giant liable for defamatory statements in its AI Overviews. The German court rejected the defense that disclaimers insulate companies. Their reasoning: AI search engines generate "independent, new, and substantive statements" rather than just linking to third parties. And since AI Overviews aren't necessary to search the web, firms must be held liable when they produce false outputs.

Meanwhile, Senator Bernie Sanders unveiled a proposed AI bill that would levy a 50% tax on the stock of major AI firms to raise $7 trillion for a sovereign wealth fund. The draft legislation would also mandate that tech giants split their AI businesses from their non-AI entities.

If passed, this would align with structural concerns about companies like Microsoft utilizing massive cloud resources to monopolize AI markets. It's a direct response to the kind of vertical integration we're seeing in this copyright case.

What This Means for Publishers and Tech

The NYT's amended complaint represents a strategic pivot. They recognized that the legal landscape had shifted after Sony v. Cox, and they adapted their arguments to meet the new standard of intentional inducement.

By alleging that Microsoft built a customized, curated machine tailored to disproportionately ingest Times works, they're trying to prove active inducement rather than passive contribution.

For publishers, this is a template. If you can show that a tech company didn't just host your content but actively designed systems to maximize its ingestion, you might have a stronger case.

For tech companies, the message is clear: building custom infrastructure for AI training isn't a neutral act. If that infrastructure is designed to prioritize copyrighted works, you're on the hook.

The fair use defense isn't dead, but it's not as strong as companies thought. Market harm matters. Intent matters. And the line between "using publicly available data" and "building a supercomputer to steal copyrighted works" is getting blurrier by the day.

We'll find out soon enough.
