Submitted by Tlaloc-Es t3_10z1jxz in MachineLearning
goj-145 t1_j80xlao wrote
Reply to comment by Tlaloc-Es in [D] Is it legal to use images or videos with copyright to train a model? by Tlaloc-Es
Not really hard when the model is spitting out watermarked images.
Miguel33Angel t1_j830cig wrote
He's asking about the case of a predictor, e.g. ResNet or other models that just categorize.
goj-145 t1_j831dqg wrote
The question is can you use copyrighted info to train a model. The answer is we don't know yet.
The current lawsuit that will set precedent on this concerns an image generation model trained on copyrighted Getty images. It's clear that Getty images were used because the watermark shows up in the model's output many times, which is the answer to "how can they prove it".
Once that is decided, we will know whether it is legal or not in those jurisdictions. And then we will get to the question of "do we do it anyways even though it's illegal?"
2blazen t1_j8378vr wrote
So you're saying Stability wouldn't have issues if they hired an intern to git clone a watermark remover and put the images through it first?
goj-145 t1_j83801h wrote
It would have been MUCH harder to prove if they spent a day preprocessing the images first!
currentscurrents t1_j85rpol wrote
They use the open LAION-5B dataset; everybody knows what's in there.
Still, some preprocessing and deduplication would have been a good idea just for output quality.
Ulfgardleo t1_j84fdfl wrote
If it is illegal now, it would be super illegal then, because removing watermarks on its own typically violates the license of the material.
The question is 100% the same as "can I include GPLv3 code in my commercial closed-source repository if I remove the license headers and ensure that the code is never published?"