AI Image Generation

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
Shame Mist didn't actually read my link to that.
No, but he did tell you very forcefully that there are NO VALID LEGAL ARGUMENTS.

You did see that, right? I don't know why we're even still talking about it. NO VALID LEGAL ARGUMENTS. lol.

Edit: NO VALID LEGAL ARGUMENTS, now with more underlining, because underlining makes you correct.
 

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
You're 100% demonstrably wrong.

Our own forum threads do this all the time. Because of the storage limits on this website, much of the content in the picture and GIF threads is hotlinked from elsewhere: Imgur's backend, Discord's CDNs, and plenty of other hosts. That is why images on older pages frequently break.

Further, any website you visit with ads is generally pulling content from a dozen different domains that serve the ad content.

This isn't AOL. The internet got very fast over the past 15 years. Your browser can load content from dozens of different domains very quickly without you even noticing.

And you're still ignoring the point that if Paramount wanted all of their served content removed from Google, or removed from the caches of various CDNs, they could make it happen. There are mechanisms for exactly that.
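For what it's worth, those mechanisms are boring, standard web plumbing. Here's a minimal sketch of the robots.txt side of it using only Python's standard library; the paths and rules are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules a studio might publish at example.com/robots.txt
# to keep image crawlers (and thus downstream caches) out of a directory.
rules = """
User-agent: Googlebot-Image
Disallow: /stills/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Googlebot-Image", "https://example.com/stills/dune.jpg"))  # False
print(rp.can_fetch("Googlebot-Image", "https://example.com/trailers/"))        # True
```

Purging already-cached copies goes through each CDN's takedown tooling, but the point stands: the pipeline from origin to cache has an off switch. Model weights don't.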
You're still ignoring that you tried to pass off the href as the image source.

Fuck, that's embarrassing.
 

Denamian

Night Janitor
<Nazi Janitors>
7,200
18,990
No, but he did tell you very forcefully that there are NO VALID LEGAL ARGUMENTS.

You did see that, right? I don't know why we're even still talking about it. NO VALID LEGAL ARGUMENTS. lol.

Edit: NO VALID LEGAL ARGUMENTS, now with more underlining, because underlining makes you correct.
I don't know shit about this anyway. The artists in my family never talk about how they learned to create their art or how they have to deal with copyright and shit.
 

Mist

Eeyore Enthusiast
<Trapped in Randomonia>
30,474
22,325

Authors Guild, Inc. v. Google, Inc.


Yes, thanks for losing the argument.



Further, for all of these copyrighted books, Google displays the copyright page of the book. It attributes the content to the original content producer.

LLMs and diffusion models do none of these things. They attempt to pass off the copied content, stored inside the model, as their own work. Again, Bing Chat improves on this a bit, to their credit.
 

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
Yes, thanks for losing the argument.


Further, for all of these copyrighted books, Google displays the copyright page of the book. It attributes the content to the original content producer.

LLMs and diffusion models do none of these things. They attempt to pass off the copied content, stored inside the model, as their own work. Again, Bing Chat improves on this a bit, to their credit.
Screen grabs are snippets. Fucking retard.

The public display of the training data doesn't happen. Fucking retard.

AI generated screen grabs do not provide acceptable substitutes of the original movies. Fucking retard.
 

Mist

Eeyore Enthusiast
<Trapped in Randomonia>
30,474
22,325
Did anybody ask any of the AI companies to remove their content and they refused?
Yes. Many people, repeatedly, including many ongoing lawsuits. It is not technically feasible.

They cannot remove the content from the model. Think of the model as a vast network of interconnected numbers. No one, not even the creators, understands how any given piece of content or any given idea is stored inside the model. The training run produced that math by burning through an enormous amount of compute, and the result is effectively a black box even to the smartest people on the planet.

But when you ask it to reproduce data it trained on, it complies and produces a near-perfect copy, meaning the original is stored in there using what is effectively an indecipherable compression algorithm.
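Here's a toy analogy, and to be clear it's a toy, not how a diffusion model actually stores anything: fit a curve exactly through some data and the original is fully recoverable, even though no individual coefficient holds any individual piece of it.

```python
import numpy as np

secret = "Mickey"  # stand-in for one memorized training example
ys = np.array([ord(c) for c in secret], dtype=float)
xs = np.arange(len(ys), dtype=float)

# "Training": fit a degree n-1 polynomial exactly through the data points.
coeffs = np.polyfit(xs, ys, deg=len(ys) - 1)
print(coeffs)  # opaque floats; no single coefficient contains any single letter

# "Prompting": evaluate the fit and the original comes back verbatim.
recovered = "".join(chr(round(v)) for v in np.polyval(coeffs, xs))
assert recovered == secret
```

Zero out one coefficient and you don't delete one letter, you corrupt the whole string. Scale that intuition up by a trillion parameters and you have the problem.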
 

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
Yes. Many people, repeatedly, including many ongoing lawsuits. It is not technically feasible.

They cannot remove the content from the model. Think of the model as a vast network of interconnected numbers. No one, not even the creators, understands how any given piece of content or any given idea is stored inside the model. The training run produced that math by burning through an enormous amount of compute, and the result is effectively a black box even to the smartest people on the planet.

But when you ask it to reproduce data it trained on, it complies and produces a near-perfect copy, meaning the original is stored in there using what is effectively an indecipherable compression algorithm.
Remove it from the training data. Fucking retard.
 

Mist

Eeyore Enthusiast
<Trapped in Randomonia>
30,474
22,325
Remove it from the training data. Fucking retard.
That's all anyone is asking, but it's not technically possible the way the technology is built.

If you remove it from the training data it does not remove it from the model, aka the current version of the product. They would have to re-run the training and produce an entirely new model anytime anyone issues a takedown request.

OpenAI's training runs are 6-9 months long, require approximately 20,000 datacenter-class GPUs costing $10k each, and consume enough energy to power tens of millions of homes. All of this is used to distill the training data into a hyper-compressed, indecipherable string of numbers containing the original data and the mathematical relationships between every bit of data inside the model.

You're demonstrating that you don't know what you're talking about. Again, there are no special math scissors that let you delete content from the model. If there were, and if content were correctly attributed to the original source, it would be much more like Google and people wouldn't be bitching.
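And you can sanity-check the scale yourself: 20,000 GPUs at $10k each is about $200 million in hardware before a single takedown-triggered retrain. Continuing the toy curve-fitting analogy from my earlier post, the only "removal" the math supports is dropping the work from the data and refitting everything:

```python
import numpy as np

print(f"${20_000 * 10_000:,}")  # $200,000,000 in GPUs alone, per the figures above

# The only honest "takedown": drop the example and refit from scratch.
old = np.array([ord(c) for c in "Mickey"], dtype=float)
new = np.delete(old, 0)  # honor a takedown request for one piece of data

new_coeffs = np.polyfit(np.arange(len(new), dtype=float), new, deg=len(new) - 1)
print(new_coeffs)  # every coefficient changes; none of them "was" the removed item
```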
 

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
That's all anyone is asking, but it's not technically possible the way the technology is built.

If you remove it from the training data it does not remove it from the model, aka the current version of the product. They would have to re-run the training and produce an entirely new model anytime anyone issues a takedown request.

OpenAI's training runs are 6-9 months long, require approximately 20,000 datacenter-class GPUs costing $10k each, and consume enough energy to power tens of millions of homes. All of this is used to distill the training data into a hyper-compressed, indecipherable string of numbers containing the original data and the mathematical relationships between every bit of data inside the model.

You're demonstrating that you don't know what you're talking about. Again, there are no special math scissors that let you delete content from the model.
Obviously they would have to stop using certain data in the training of new models, and stop using models trained on that data, if such use were found to violate copyright. Just because someone files a lawsuit doesn't mean they have a valid claim.

You're still not owning up to trying to pass off an href as the image source while claiming you're a tech expert and we don't "get it."

Got any explanation for that, or are we just trying to brush that under the rug?
 

Mist

Eeyore Enthusiast
<Trapped in Randomonia>
30,474
22,325
Obviously they would have to stop using certain data in the training of new models, and stop using models trained on that data, if such use were found to violate copyright. Just because someone files a lawsuit doesn't mean they have a valid claim.

You're still not owning up to trying to pass off an href as the image source while claiming you're a tech expert and we don't "get it."

Got any explanation for that, or are we just trying to brush that under the rug?
That's just your lack of reading comprehension, I already covered this in that same post:
So the image you're presented in the index is in fact pulled from the original source. There's some complex caching at the CDN level to speed this up, which is what you see in the img src tag, but the content is pulled from the original source and links to the original source. Further, Google is not presenting this as their own work, unlike an LLM or Diffusion model.
The href shows where the content came from, and links to where you can go find it. It therefore serves as both a reference and an attribution. The img src is a reference to the cache on some CDN somewhere. Content Delivery Networks are absurdly complex and speed up a lot of what we do on the internet every day, but if a content owner wanted their material removed from the CDNs, their engineers could make it happen. The smartest AI developers in the world do not know how to snip content out of a model.
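If it helps, here's the shape of it in made-up markup (none of these URLs are real), plus stdlib Python pulling the two roles apart:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Invented markup in the shape of an image-index result: the <a href>
# attributes and links to the original, the <img src> hits a CDN cache.
snippet = (
    '<a href="https://example-studio.com/stills/scene42.html">'
    '<img src="https://tbn0.example-cdn.net/images?q=abc123"></a>'
)

class HrefVsSrc(HTMLParser):
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            print("attribution (href) ->", urlparse(attrs["href"]).netloc)
        elif tag == "img" and "src" in attrs:
            print("cache copy (src)   ->", urlparse(attrs["src"]).netloc)

HrefVsSrc().feed(snippet)
# attribution (href) -> example-studio.com
# cache copy (src)   -> tbn0.example-cdn.net
```

Two different attributes doing two different jobs. That's the whole point I was making.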

Further, Google does at least try to compensate content producers, even if they do a shitty job of it.


And your first sentence concedes my entire point here, so you've just admitted to losing.
 

Mist

Eeyore Enthusiast
<Trapped in Randomonia>
30,474
22,325
In an ideal world, every consenting content producer gets their work trained into these models, contributions above a certain percentile threshold get attributed, and everyone gets compensated anytime their content is used.

In the world we're heading for, Woke OpenAI owns the entire internet and gets to pass all of its content off as their own, then sell it back to us. Why, Cad, do you support one of the Wokest companies on Earth?
 

Cad

I'm With HER ♀
<Bronze Donator>
24,496
45,437
Why, Cad, do you support one of the Wokest companies on Earth?
I don't; I support intelligence and reasoned arguments that make sense given the laws we have, which is why I usually oppose you.
 

Captain Suave

Caesar si viveret, ad remum dareris.
4,814
8,142
There is no valid legal argument that compressing the original content itself and storing it inside the model, aka OpenAI's product, qualifies as either Fair Use or Transformative Use.

The jury is literally still out on this. To the extent that people making GPTs are retraining and otherwise avoiding copyrighted works, it's because they don't want to invest in lawyers until the courts rule. If it were truly as cut and dried as you imply, Google and OpenAI would have settled already.

It's entirely possible that the courts will decide that training is OK in the public interest, and that only using specific model outputs that are obviously similar to copyrighted works for commercial purposes is impermissible, just as with human artists. Obviously this is tricky to implement, but perhaps possible by comparing image hashes.
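For example, a perceptual hash (as opposed to a cryptographic one) survives resizing and compression, so near-duplicates land only a few bits apart. A rough sketch of the classic "average hash", assuming Pillow is installed; the file names and the threshold are placeholders, not a real pipeline:

```python
import numpy as np
from PIL import Image

def average_hash(img: Image.Image, size: int = 8) -> int:
    """Shrink to size x size, grayscale, set a bit where a pixel beats the mean."""
    small = np.asarray(img.convert("L").resize((size, size)), dtype=float)
    m = small.mean()
    return sum(1 << i for i, v in enumerate(small.flatten()) if v > m)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# Hypothetical usage; both file names and the cutoff of 5 are invented.
# still = average_hash(Image.open("studio_still.png"))
# output = average_hash(Image.open("model_output.png"))
# if hamming(still, output) <= 5:
#     print("output is suspiciously close to the copyrighted frame")
```

Where you set that cutoff is exactly the tricky part: too tight and infringers skate, too loose and you're flagging every desert landscape as Dune.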

On a practical level, if they make training on copyrighted works illegal then the best models will come from China and people will just use those.
 

Control

Ahn'Qiraj Raider
2,269
5,761
So if I google Dune movie screencap and google does in fact show me movie screencaps, is google also guilty of copyright infringement for having those images in their search engine?
Yes. Just because they've used their giant mountain of money to fend off legal (and cultural) challenges to their business models doesn't mean that said mountain of money wasn't built off of other people's work. (As a bonus, this answer also bypasses the Misting.)
 

Lambourne

Ahn'Qiraj Raider
2,728
6,550
What's in the model and what's not doesn't matter at all. Copyright only exists inside a courthouse, and if the court decides that whatever you're printing on a t-shirt looks too much like Mickey Mouse, you get slapped down. Doesn't matter if Mickey really is not in the training model; it won't make a bit of difference.

In related news, Steamboat Willie looks to be finally, finally going out of copyright on January 1st, so there are bound to be suits filed by Disney against whoever dares take them on. They're already moving their pieces into place by making all sorts of Steamboat Willie licensed products, because it'll help their case in court. Not like today's kids were asking Santa for merch from a black-and-white cartoon from the 1920s.

2024 is going to be a good year to be a copyright lawyer.
