In February 2023, Getty Images walked into federal court in Delaware and filed what may turn out to be the most consequential lawsuit in the short, chaotic history of generative AI. The target was Stability AI, maker of Stable Diffusion. The allegation was straightforward: Stability AI built a billion-dollar product by scraping and training on Getty’s copyrighted photographs — millions of them — without asking, without paying, and without so much as a thank-you note. Getty wanted damages. The AI industry wanted the whole thing to go away under fair use.
As of early 2026, neither side has gotten what it wanted. The case has ground through motions, procedural skirmishes, and legal filings that would bore anyone not billing by the hour — but no court has yet ruled on the central question everyone actually cares about: is training an AI model on copyrighted images legal? The answer to that question will shape the economics of every image-generating AI on the market, the livelihoods of professional photographers and illustrators, and the fundamental architecture of how these models get built. And right now, nobody knows the answer.
What we do know is that this isn’t just one lawsuit. It’s a wave — and the wave is still building.
Stable Diffusion didn’t emerge from nowhere. It was trained on LAION-5B, a dataset of roughly five billion image-text pairs scraped from the public internet. The dataset was assembled with breathtaking speed and made freely available to researchers — a choice that accelerated AI image generation dramatically, and that also, critics argue, transferred an enormous amount of creative labor into training data without the knowledge or consent of the people who produced it.
Getty Images was not subtle about its displeasure. In its February 2023 complaint filed in the U.S. District Court for the District of Delaware, the company alleged that Stability AI had copied more than 12 million photographs from Getty’s library, complete with watermarks — a detail that became something of an exhibit A in the court of public opinion, since early Stable Diffusion outputs occasionally hallucinated distorted versions of the Getty watermark, suggesting the model had literally trained on watermarked stock images.
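The watermark detail is less mysterious once you see how LAION-style data is actually distributed: not as images, but as metadata tables of URLs, alt-text captions, and scores estimated by auxiliary models, including a watermark probability. Here is a minimal sketch of inspecting and filtering one such metadata shard. The file name is hypothetical, and while the column names follow LAION’s published schema, treat the whole thing as an illustration rather than a reconstruction of anyone’s actual pipeline.

```python
# Minimal sketch: inspecting a LAION-style metadata shard for likely
# watermarked images. LAION-5B ships as parquet metadata (URL, caption,
# estimated scores), not as image files. The file name is hypothetical.
import pandas as pd

df = pd.read_parquet("laion-metadata-shard-00000.parquet")

# Each row is one scraped image-text pair. "pwatermark" is the
# probability, estimated by a classifier at collection time, that the
# image carries a visible watermark.
print(df[["URL", "TEXT", "pwatermark"]].head())

# A curator who wanted to exclude likely stock-photo watermarks could
# threshold that score before downloading a single image.
clean = df[df["pwatermark"] < 0.5]
print(f"kept {len(clean):,} of {len(df):,} pairs")
```

Whether any such filter was applied strictly enough in early Stable Diffusion training runs is precisely the question the hallucinated watermarks raised.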
“Stability AI’s technology was built on the back of our content without permission, without credit, and without compensation.” — Getty Images, statement announcing its litigation, early 2023
Getty’s complaint wasn’t just about the training. It alleged violations of the Digital Millennium Copyright Act, trademark infringement, and unfair competition — a kitchen-sink approach that covered multiple legal theories in case one of them didn’t survive early motions. The company had also filed a separate case against Stability AI in the UK High Court in January 2023, adding a transatlantic dimension to an already complicated situation.
Getty was hardly alone. Over the course of 2023, the legal system also absorbed a visual artists’ class action against Stability AI, Midjourney, and DeviantArt (January 2023), the Authors Guild class action against OpenAI and Microsoft (September 2023), and the New York Times’ lawsuit against OpenAI and Microsoft (December 2023). The AI industry went from operating in a legal gray zone to facing simultaneous lawsuits on multiple fronts, filed by some of the most well-resourced plaintiffs in the creative economy.
Every AI company defending these cases reaches for the same legal shield: fair use. The argument, stated simply, is that training a model on copyrighted material is transformative — that the model learns patterns and concepts rather than reproducing the original works, and that this learning process is analogous to how a human artist absorbs influences without licensing every painting they’ve ever seen.
It’s a coherent argument. It’s also one that courts have never actually tested in the context of machine learning at scale, which is precisely why everyone from rights holders to venture capitalists is watching these cases so carefully.
Fair use in U.S. copyright law turns on four statutory factors (17 U.S.C. § 107): the purpose and character of the use (is it transformative?), the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the market for the original. AI companies score reasonably well on the first factor and poorly on the fourth — if Stable Diffusion can generate a stock photo that replaces a Getty image in a client’s workflow, that’s a market substitution effect that courts take seriously.
Legal scholars who’ve analyzed the cases note that the transformativeness argument, while plausible, has never been tested at the scale AI companies are operating at. Scraping billions of images isn’t analogous to a painter walking through a museum. The mechanized, systematic nature of it — and the commercial product that results — puts it in territory courts haven’t mapped.
The Authors Guild case introduced a different dimension: the books used to train large language models weren’t just borrowed for inspiration, the complaint alleged, but functionally memorized in ways that allow the models to reproduce substantial portions of copyrighted text. If courts accept that argument for books, the visual artists’ equivalent — that image models can reproduce recognizable stylistic elements or even specific compositions — becomes more credible.
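Memorization claims of this kind are, at least in principle, testable: compare a model’s output against a source text and measure the longest run of consecutive words the two share. A handful of shared words suggests influence; a run of dozens or hundreds looks like reproduction. The sketch below shows the shape of that measurement with toy inputs; nothing in it is drawn from the actual court filings.

```python
def longest_shared_run(output_words, source_words):
    """Length of the longest run of consecutive words common to both texts
    (classic longest-common-substring dynamic programming, over words)."""
    best = 0
    prev = [0] * (len(source_words) + 1)
    for ow in output_words:
        curr = [0] * (len(source_words) + 1)
        for j, sw in enumerate(source_words, start=1):
            if ow == sw:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

# Toy example: a 12-word verbatim run. In litigation over memorization,
# plaintiffs point to runs far longer than any plausible coincidence.
out = "it was the best of times it was the worst of times".split()
src = ("charles dickens wrote it was the best of times "
       "it was the worst of times").split()
print(longest_shared_run(out, src))  # -> 12
```

Word-level matching is crude, but it makes the point; real audits work at the token level and against whole corpora rather than single documents.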
Here’s the frustrating reality for everyone hoping for legal clarity: the cases have moved slowly, and the substantive copyright questions have mostly not been adjudicated yet. The Getty Images case in Delaware has proceeded through early procedural phases, with both sides filing motions that shape the scope and framing of the case, but no judge has yet ruled on whether what Stability AI did was legal.
The visual artists’ class action had a rougher start — a federal judge in San Francisco dismissed most of its claims in October 2023 as inadequately pleaded, while allowing the direct copyright infringement claim against Stability AI to proceed. The case wasn’t thrown out entirely, the plaintiffs were given the opportunity to amend their complaint, and the core copyright question survived.
In the New York Times case against OpenAI, early rulings have allowed the case to proceed on its central copyright claims — a meaningful signal that courts aren’t prepared to accept AI companies’ fair use arguments at the motion-to-dismiss stage. The Times case involves text rather than images, but the underlying legal framework is the same, and a ruling in that case will inevitably inform the image-focused litigation.
What hasn’t happened — and what the AI industry has been quietly banking on — is a definitive ruling that AI training constitutes fair use as a matter of law. Instead, what’s emerged is a messy, multi-front legal battle that has so far produced more uncertainty than clarity.
Behind the public posturing, settlement pressure is real on both sides. Stability AI has faced significant financial turbulence since these lawsuits were filed — the company went through leadership changes, layoffs, and reported cash-flow problems that became public over 2023 and 2024. Litigating simultaneous high-stakes copyright cases is expensive, and Stability AI’s war chest is considerably smaller than OpenAI’s or Google’s.
Rights holders, meanwhile, have their own incentives to settle rather than wait years for a verdict. A settlement that establishes a licensing framework — where AI companies pay to train on image libraries — might be more commercially valuable than a court victory that takes five years and delivers a narrowly written opinion. Getty Images already has a licensing infrastructure. A world where AI companies pay licensing fees to train on stock photo libraries is a world Getty can do business in.
The music industry’s experience with digital licensing offers a rough template. Napster-era lawsuits didn’t just punish infringers — they eventually produced licensing frameworks that, however imperfect, created a new economic model for the industry. The visual AI space may be heading toward something similar: a negotiated settlement of the legal questions that produces industry-wide licensing norms, rather than a clean court ruling that definitively answers the fair use question.
The problem with that outcome, from a legal clarity standpoint, is that settlements don’t make precedent. If Getty and Stability AI settle — as they may well do — the underlying copyright question remains unresolved for every other AI company, every other image model, every other rights holder. The next lawsuit starts from scratch.
One of the stranger aspects of this legal standoff is that the AI image generation market didn’t wait for courts to sort things out. Between the filing of the Getty lawsuit in February 2023 and early 2026, the field has produced Midjourney V7, Flux, Imagen 4, Grok Imagine, and a dozen other capable image generators — all trained on datasets whose legal status remains, to put it charitably, contested.
Some companies have tried to get ahead of the problem. Adobe Firefly was explicitly trained on licensed and public domain images — a deliberate product decision that let Adobe market Firefly as commercially safe in a way competitors couldn’t claim. The pitch worked, at least as a differentiator, even if Firefly’s output quality initially trailed models trained on vastly larger datasets.
Getty Images itself launched Generative AI by Getty Images, a tool built on its own licensed library, a move that doubled as a product launch and a demonstration of what a rights-respecting training pipeline could look like. The images it produces are fine. The gesture was pointed: here’s proof that you don’t have to scrape to build an image model, you just have to pay for it.
The rest of the industry has largely continued operating on the assumption that fair use will eventually cover them, or that settlements will be manageable, or — and this is the honest version — that by the time courts rule definitively, the technology will have moved far enough that the specific training data decisions made in 2021 and 2022 will be commercially irrelevant.
That last assumption is worth interrogating, because it’s the one the industry leans on most heavily when pressed. The argument is essentially: the models exist, they’re deployed, the training data is baked in — even a ruling that training on copyrighted images was infringement can’t un-train Stable Diffusion or GPT-5. What it can do is affect future model development, and that’s where the stakes are genuinely high.
If courts eventually rule that AI training on copyrighted images constitutes infringement without a license, the economics of building image models change significantly. Licensing billions of images at commercial rates isn’t free. It creates barriers to entry that favor large, well-capitalized companies over startups and open-source projects. It potentially fragments the training data landscape by geography, since licensing terms vary by jurisdiction. And it raises the question of retroactive liability — whether companies that built their models on scraped data owe damages for that historical use even if they switch to licensed data going forward.
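How not-free? No public benchmark exists for what a negotiated per-image training rate would be, but the arithmetic is unforgiving at every plausible price point. The figures below are pure illustration, not numbers from any actual negotiation.

```python
# Back-of-envelope licensing math for a LAION-5B-scale training set.
# Every rate here is a made-up illustration; no negotiated rate is public.
images = 5_000_000_000

for rate in (0.001, 0.01, 1.00):  # hypothetical dollars per image
    total = images * rate
    print(f"${rate:.3f}/image -> ${total / 1e6:>7,.0f}M total")

# $0.001/image -> $      5M total  (a rounding error for Big Tech,
# $0.010/image -> $     50M total   real money for a startup)
# $1.000/image -> $  5,000M total  (i.e., $5B, at stock-photo-ish rates)
```

Even the cheapest hypothetical rate is real money for an open-source project, and the stock-photo end of the range is a sum only a handful of companies could contemplate.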
Conversely, a clear ruling that AI training is fair use would be a green light for continued large-scale scraping, with no legal obligation to compensate the creators whose work goes into the training pipeline. For photographers, illustrators, and stock agencies, that outcome would be economically devastating — not just because of lost licensing revenue, but because AI models trained on their work compete directly with them in the market for images.
Neither outcome is clean. Both have significant consequences for how AI development proceeds. And both are still fully in play.
The Getty case is the most prominent of the image-focused lawsuits, but legal observers watching the full landscape tend to focus on a few others as potential bellwethers. The New York Times case, despite involving text rather than images, will produce some of the first substantive appellate-level reasoning about AI training and copyright — reasoning that judges in image cases will read carefully. The visual artists’ class action, if it survives to trial, would require a court to engage directly with the question of whether AI image generation can constitute derivative infringement.
The UK dimension of the Getty case adds a wrinkle: UK copyright law has no general fair use doctrine, only narrower fair dealing exceptions, and a ruling from the UK High Court — which might come sooner, given different procedural timelines — would carry weight in international AI policy discussions even without binding U.S. precedent.
IP scholars following these cases have consistently made one observation: regardless of how individual cases resolve on procedural or narrow factual grounds, the underlying questions about AI training and copyright will keep generating new lawsuits until either Congress acts or an appellate court delivers a ruling broad enough to settle the basic framework. Neither of those things has happened yet.
The honest answer is that the AI industry is operating on legal foundations that haven’t been tested, in cases that haven’t concluded, under a copyright framework that wasn’t designed with machine learning in mind. That’s not a comfortable position, and the discomfort is appropriate.
What makes the current situation genuinely strange is that the companies with the most exposure — the ones that trained the largest models on the broadest datasets — are also the ones with the most resources to litigate indefinitely, settle strategically, or lobby for legislative solutions. Smaller players and open-source projects, which often have more ambiguous legal exposure and less ability to manage it, are left navigating uncertainty that larger companies can absorb.
For photographers and illustrators watching this unfold, the stakes are more personal than procedural. The question isn’t just whether a court will eventually rule in their favor — it’s whether a ruling, whenever it comes, will translate into actual compensation for work already used, or just shape the rules for future models that have already been built. The legal system moves at a pace that the AI development cycle has already lapped several times over.
Getty Images v. Stability AI will eventually resolve — through settlement, through trial, or through some legislative intervention that makes the whole question moot. What it won’t do, on its own, is settle the broader question of who owns the visual record of the internet and what rights attach to it when you feed it into a neural network. That question is going to need more than one lawsuit to answer. Judging by the current docket, it’s going to get several more tries.
