A recent court decision (Bartz v. Anthropic) gave AI developers a roadmap for staying out of copyright trouble, and a big flashing warning sign for what not to do.
The case: Anthropic, the company behind Claude, trained its models on a huge stash of books. Some were bought, scanned, and stored. Others came from pirate sites. You can guess where this is going.
The court split the baby:
- Bought and scanned books? Fair use. Transformative. Like teaching a student to write: new output, not copies. Destroy the paper version, keep the searchable digital version, and you're fine.
- Pirated books? Not fair use. Doesn’t matter if you never train on them. If you’re holding unauthorized copies, you’re replacing legitimate sales. That’s infringement, and damages are on the table.
For AI training, the message is simple: if you want fair use on your side, start with legally obtained material. Buy it or license it; just don't snatch it from the high seas of the internet.
Because in this court’s eyes, training on a legit copy is like learning from a library book. Training on a stolen copy? That’s like breaking into the library at night.
At the end of the day: Pay for your inputs. Your model will be smarter, and so will you.