The largest copyright recovery, ever. First of its kind.
The problem is how Anthropic acquired the material, not that they are munging it and offering a munged version of the information. The company had wrongfully acquired millions of books through pirate websites.
Anthropic downloaded more than 7 million digitized books.
What a brilliant business strategy. Instead of paying $10,000,000 to purchase the books legally they instead paid $1,500,000,000 to pirate the books.
Google did the same thing, Meta did the same thing, OpenAI did the same thing. I think the common thing with large companies is to do something illegal and ask for forgiveness later, after they have already made a profit from their wrongdoing…
This is not entirely accurate. Even if you have a copyright page in your book, that doesn’t mean you will get the settlement. I have a book that they stole; it was the first book I published. But I won’t get paid (and many other self-published authors won’t get paid either) because you have to have paid for proof of copyright with the Copyright Office to get part of the settlement. That’s why only authors from the USA can get it, because other countries don’t have a paid system for copyright proof.
Transcript
0:00
Welcome once again to Lehto's Law.
0:02
Here's Steve Lehto.
0:04
Edward sent me a note: "Steve, check out
0:05
this story. As an author, you might
0:07
appreciate it." And he is correct. A
0:09
company called Anthropic. I could be
0:12
mispronouncing it, but I couldn't find a
0:14
pronouncer on the internet. Or Anthropic
0:16
to pay authors $1.5 billion to settle a
0:20
lawsuit over pirated books used to train
0:23
AI chatbots. So Matt O'Brien wrote this
0:27
for the Associated Press, but the company
0:29
has agreed to pay $1.5 billion to
0:32
settle a class action lawsuit by book
0:34
authors who say the company took pirated
0:36
copies of their works to train its
0:38
chatbot. Interestingly though, the court
0:41
did not say that they shouldn't have
0:44
used the material to train the chatbots,
0:46
but the problem is how they acquired the
0:48
material. And so that's what this whole
0:50
thing turns on. The landmark settlement,
0:52
if approved by a judge, could mark a
0:54
turning point in legal battles between
0:56
AI companies and writers and visual
0:59
artists and other creative professionals
1:01
who accuse them of copyright
1:02
infringement for using all that material
1:04
without permission. The company agreed to
1:06
pay authors or publishers about $3,000
1:09
for each of an estimated half a million
1:12
books covered by the settlement. So, the
1:15
lawyer for the authors says, "As best
1:18
as we can tell, it's the largest
1:20
copyright recovery
1:23
ever. It's the first of its kind in the
1:25
AI era." Now, three authors sued last
1:28
year. They now represent a broader group
1:31
of writers and publishers whose books
1:34
were downloaded by the company to train
1:36
its chatbot. A federal judge dealt the
1:39
case a mixed ruling back in June,
1:40
finding that training the AI chatbots
1:42
on the copyrighted books wasn't illegal
1:45
by itself, but that the company had
1:48
wrongfully acquired millions of books
1:50
through pirate websites, and that was
1:53
wrong. So, if the company had not
1:56
settled, experts say losing the case
1:58
after a scheduled December trial could
2:00
have cost the company even more money.
2:03
So, a legal analyst says, "We were
2:05
looking at a strong possibility of
2:07
multiple billions of dollars, enough to
2:10
potentially cripple or even put the
2:11
company out of business." A U.S. district
2:14
judge in San Francisco scheduled a
2:15
Monday hearing to review the settlement
2:18
terms. The company said in a statement
2:20
that the settlement, if approved, will
2:21
resolve the remaining legacy claims will
2:24
resolve the claims. We remain committed
2:27
to developing safe AI systems that help
2:29
people and organizations extend their
2:30
capabilities, advance scientific
2:32
discovery, and solve complex problems,
2:36
says the company's deputy general
2:37
counsel. As part of the settlement, the
2:39
company's also agreed to destroy the
2:42
original book files that it downloaded.
2:45
So books are an important source of data
2:49
that are needed to build the AI large
2:51
language models behind chatbots like
2:54
Claude by this company or ChatGPT by
2:58
OpenAI. And so what they've done is
3:00
they've downloaded these, you know, data
3:03
sets, which are billions of words, but of
3:06
course the order in which those words
3:07
appear in these books is quite
3:09
important. So the ruling in June found
3:12
that this company had downloaded more
3:13
than 7 million digitized books that it
3:17
knew had been pirated. It started with
3:20
nearly 200,000 from an online library
3:23
assembled by AI researchers outside of
3:25
OpenAI to match the vast collections on
3:28
which ChatGPT was trained. The company
3:30
later took at least 5 million copies
3:33
from a pirate website and two million
3:37
from another. That's according to the
3:39
judge. The Authors Guild told its
3:41
thousands of members last month that it
3:43
expected damages will be minimally $750
3:47
per work or could be much higher if it
3:50
went to trial. Now,
3:52
the settlement's higher award of $3,000
3:56
per work likely reflects a smaller pool
3:58
of affected books because there are
4:00
probably duplicates among those millions
4:03
of books and also those without
4:05
copyright. Now, now I've written a whole
4:08
bunch of books and I assure you they are
4:10
copyrighted.
4:12
Um, recently the CEO of the Authors
4:14
Guild called the settlement an excellent
4:15
result for authors, publishers, and
4:17
rights holders generally, sending a
4:20
strong message to the AI industry that
4:22
there are serious consequences when they
4:24
pirate authors' work to train their AI,
4:26
robbing those least able to afford it.
4:29
Uh, meanwhile, the Danish Rights
4:31
Alliance, which successfully fought to
4:33
take down one of those shadow libraries,
4:35
said Friday that the settlement would be
4:37
of little help to European writers and
4:39
publishers whose works aren't registered
4:40
with the US Copyright Office. So, this
4:43
is going to apply to, in essence,
4:44
American authors. On the one hand, it's
4:48
comforting to see that compiling AI
4:50
training data sets by downloading
4:51
millions of books from known illegal
4:53
file sharing sites comes at a price,
4:55
says the group's head of content
4:57
protection enforcement. On the other
4:59
hand, it fits a tech industry playbook
5:01
to grow a business first and later pay a
5:04
relatively small fine compared to the
5:06
size of business for breaking the rules.
5:08
You know, move fast, break rules. So, it
5:10
is my understanding these companies see
5:12
a settlement like this one as a price of
5:14
conducting business in a fiercely
5:16
competitive space. Now, the privately
5:18
held company we're talking about was
5:20
founded in 2021 and
5:22
earlier this week put its value at
5:25
$183 billion after raising another $13
5:29
billion in investments. And so
5:34
the um $1.5 billion uh is that's a lot
5:40
of money to you or me, but with that
5:43
kind of money behind them, if it's just
5:44
the cost of doing business, they go,
5:46
"Okay, fine. We we we've got our
5:48
language models now. We don't we don't
5:51
we we're good. We're good. We'll pay the
5:53
1.5 and we'll move on." So the real
5:55
question I have is I've written
5:58
more than 10 books.
6:00
I wonder if any of them got swept up in
6:02
this because if they did, I could get
6:05
some money. And the interesting thing is
6:07
I heard about this case, but I never
6:10
heard about it in the sense that as far
6:11
as I know, I never got notified. And of
6:13
course there are class actions out there
6:15
where if this was a typical class action
6:18
and as a member of the class I might not
6:20
have heard anything yet but I'll have to
6:23
I'll have to look into that. But it's an
6:25
interesting question and so I've known
6:28
that some of my books um are out there
6:31
as ebooks and once they show up out
6:33
there as an ebook then they wind up on
6:34
these pirated sites. I've had other
6:36
books of mine that were never put into
6:37
ebook form. So, you know, uh you know,
6:40
that that wouldn't be a concern, I don't
6:42
think. But I'll have to take a look at
6:44
that and see. So, it's it's it's wild
6:47
because the settlement amount is so
6:49
large, $1.5 billion. And we've often
6:53
talked jokingly about class actions
6:55
where the attorneys will get, you know,
6:57
$79 million, and each member of the
6:59
class gets three cents and a coupon for
7:02
a free soft drink at Big Boy. And um
7:05
makes a lot of sense unless Big Boy is
7:07
not involved in the case. However,
7:12
um here obviously
7:16
it sounds like enough money where I'd go
7:18
cash the check. Um I think I've actually
7:20
gotten some class action settlement
7:22
checks I didn't bother to cash cuz they
7:24
send you a postcard and when you realize
7:27
the check, you go, "Oh, 48 cents. Do I really
7:31
drag that to the bank and deposit it?"
7:34
People are going to go, "Steve, you
7:35
could deposit it electronically." Not
7:38
not back when I got that. It's been a
7:39
while. But I digress. So, if I got a
7:43
check for a couple hundred bucks,
7:44
certainly certainly I would not only
7:46
cash it and deposit it. I would do a
7:48
video about it. So, we'll see if that
7:50
happens. I will keep my eyes peeled. I
7:52
will also do some research. But again,
7:53
it's a company called Anthropic or
7:56
Anthropic. I don't know. And I looked it
7:58
up and according to Wikipedia, their
8:01
actual logo doesn't use an I between the
8:05
P and the C, but they use a a slash.
8:08
I believe it's called a virgule or a
8:10
virtual. But would that be Anthropic or
8:14
anthrop? I don't know. I don't know.
8:18
It's a funny thing about a company name.
8:20
It it it should be pronounceable. But
8:24
again, Matt O'Brien wrote that for the
8:26
Associated Press. Edward sent me. Thanks
8:28
a lot. Anthropic to pay authors $1.5
8:32
billion to settle lawsuit over pirated
8:34
books used to train AI chatbots. I will
8:36
go do research and see if I'm in the
8:38
class. Questions or comments, put them
8:39
below. Always talk to you later.
8:41
Bye-bye.
8:42
Thank you for watching Lehto's Law.