Lee Fang

What Happened to Piracy? Copyright Enforcement Fades as AI Giants Rise

"Laws are spider webs through which the big flies pass and the little ones get caught." -Balzac

Lee Fang
Nov 04, 2025

The artificial intelligence revolution threatens to uproot entire professions, replace millions of workers, and reshape industrial relations. Much ink has been spilled on the ways in which AI influence has already taken root. Wall Street and Washington, D.C., are both betting on the technology to power the future of American innovation.

But little has been said about how the industry has already seized control over key components of American governance. Look no further than the quiet shifts in the application of copyright law.

Since the mid-nineties, software giants led by Microsoft have waged a global war against copyright infringement and online piracy. They bankrolled groups like the Business Software Alliance to demand increased penalties for copyright violations and pressured FBI agents to raid foreign hosts accused of harboring illicit content-sharing servers. For the old software model, duplicated Microsoft Office disks and fake software licenses posed the greatest risk.

Then-Microsoft Deputy General Counsel Brad Smith, in a 2001 interview with the Wall Street Journal, championed the crusade against digital theft as part of a sprawling battle against “organized criminal enterprises.” The company and its allies marshaled their resources to encourage the federal government to crack down on foreign piracy sites, especially illegal file-sharing firms based in Russia, Hong Kong, and Brazil.

In a case that typified this old era of aggressive copyright enforcement, the Justice Department in 2011 pursued criminal charges against Aaron Swartz, a young open internet activist, for downloading JSTOR’s repository of scholarly papers without authorization. Faced with the prospect of decades in prison, he died by suicide during the prosecution.

Much has changed since advances in artificial intelligence made the technology the focal point of Silicon Valley innovation. Smith is now president of Microsoft, and the company and its partner OpenAI, which runs exclusively on Microsoft’s Azure cloud computing network and was backed with $13.75 billion in investment funds from Microsoft, are at the center of a very different type of copyright dispute. This time, with the power of the tech industry still looming over Washington, D.C., prosecutors are less interested in going after those suspected of illegally downloading copyrighted work.

That is because it is now the tech giants that are accused of exploiting pirated content on an industrial scale. Meta, Anthropic, Microsoft, Google, xAI, and OpenAI are competing to vacuum up as much data as humanly possible in a race to develop their respective AI models. The most prized training data, it turns out, are vast quantities of copyrighted material, largely in the form of published works such as academic articles, novels, and nonfiction books.

After decades of FBI warnings about copyright violations and the dangers of piracy, the federal government has suddenly lost interest in such crimes. That has left enforcement to civil class-action lawsuits, many of them filed by authors and writers who say tech giants are plundering their works for AI training without authorization, payment, or notification.

The court cases have cast a spotlight on a stratospheric level of hypocrisy. Microsoft, which once cast peer-to-peer and dark web piracy sites as an existential threat that cost the economy billions of dollars in damages, allegedly taps the very same types of illicit forums for a huge range of copyrighted academic articles, novels, and nonfiction.

The AI giants have all but admitted that they have developed their most advanced models by tapping into mass piracy. The lawsuit Kadrey et al. v. Meta Platforms revealed that Meta, the parent company of Facebook, used a mirror of Library Genesis, a notorious library of pirated books hosted on Russian servers, to train its generative AI systems.

Tech executives have pressed for licensing deals with some publishers—but in many cases have gone ahead with simply stealing millions of books and articles via known piracy sites on the dark web and other illicit forums. The litigation produced emails and documents showing Meta employees admitting that “torrenting from a [Meta-owned] corporate laptop doesn’t feel right 😃.” In one exchange, engineers noted that use of the illegal content had been escalated to Meta CEO Mark Zuckerberg (referred to as “MZ”) and that the decision was “approved to use.”

© 2025 Lee Fang