Unfortunately, it wasn't the scraping and use of copyrighted work that was ruled illegal, but rather the mass piracy committed by Anthropic.
Anthropic won partially: the court stated that the use of copyrighted books can qualify as fair use under federal law, but that this doesn't absolve them of taking pirated material, and buying a licensed edition later doesn't defeat the copyright claim. The piracy claims remain independently actionable. Anthropic's behaviour during the lawsuit shows that they were actually concerned, too, as they attempted to resolve the claims before trial.
Each step of the process is assessed separately, so while they may succeed with the fair use defense at one step, they can fail with it at another.
Essentially, if an AI company obtains material legally, whether through paying, lawful curation, or a licensed database, they may be able to use it. However, if it's illegally scraped (like much, if not most, was), they cannot legally use it. It is unclear whether the offence was scraping pre-pirated work or the unlicensed scraping of data from legitimate sources, as the latter may be considered "fair use" and "lawful curation."
As the article states, it gives future plaintiffs a roadmap: "if you can show your work was scraped from an illicit repository rather than a licensed database, the fair use shield may not apply." This has left AI companies needing licensed alternatives quickly, as they cannot safely rely on pirated datasets, and deals with publishing houses have been made for training data.
The article leaves it unclear whether scraping licensed databases without permission is protected under "fair use," or whether permission is needed.
Anthropic continues to face trial on the piracy-related claims, despite relying on the fair use shield.
The payout came to $1.5b collectively, at roughly $3,000 per book, which means approximately 500,000 books were pirated.
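As a quick check of that arithmetic, here is a minimal sketch that divides the total payout by the per-book amount. It assumes a flat per-book rate, which is a simplification, and the function name is purely illustrative; the two figures are the ones quoted above.

```python
# Sanity check on the book count implied by the payout figures quoted above.
# Assumption: every book receives the same flat amount.
# implied_book_count is a hypothetical helper name, used only for illustration.

def implied_book_count(total_payout: int, per_book: int) -> int:
    """Return how many books a total payout covers at a flat per-book rate."""
    if per_book <= 0:
        raise ValueError("per_book must be positive")
    return total_payout // per_book


TOTAL_PAYOUT = 1_500_000_000  # 1.5 billion, as stated above
PER_BOOK = 3_000              # approximate per-book amount, as stated above

books = implied_book_count(TOTAL_PAYOUT, PER_BOOK)
print(f"{books:,} books")  # 500,000 books
```

The result matches the figure of roughly 500,000 books, so the three numbers are at least consistent with each other.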
Its reliance on fair use is shaky, though. Fair use, which is a US doctrine, weighs certain factors: the purpose of the use (such as educational or non-commercial), the nature of the work (using factual work is more often permitted than using creative work), the amount used, and the effect on market value (if it affects authors, as it has, the use is less likely to be considered legal). Anthropic has taken large amounts of creative work, affecting authors negatively, as they must compete with poorer, AI-generated writing made using their own stolen art. Seemingly, the only leg they have to stand on (and it's flimsy at best) is purpose of use, which is what they argued. But as the model is allowed to generate creative writing, which directly harms writers, it doesn't seem to be a strong argument.
While these findings are sure to have ripple effects, the finding that scraping copyrighted work is not illegal may not hold in every country; in the UK, for example, it may be considered commercial use, and therefore copyright infringement.



















