znihilist
znihilist t1_j6z78wg wrote
Reply to comment by maxToTheJ in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
That's beside the point. My point is that the MP3 compression comparison doesn't work, so that line of reasoning isn't applicable. Whether one use can excuse another isn't part of the argument.
znihilist t1_j6xcp1i wrote
Reply to comment by SulszBachFramed in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
Good point, but the way I see it, these two things look very similar without actually being similar in the way we thought or wanted. Compression takes one input and generates one output; the object (the file, if you want) is only one thing, say an episode of House. We'd argue that both versions are essentially identical and differ only in the underlying representation (their 0's and 1's are different, but they render the same object). Also, that object can't generate another episode of House (one that aired a day earlier), or a nonexistent episode of House where he takes over the world, or where he's a Muppet. Since the diffusion models don't hold a copy, the comparison fails on that particular aspect and isn't applicable.
I do think the infringement aspect is going to end up resting with the user and not with the tool. It's akin to how, just because your TV can play pirated content, we assign the blame to the user and not to the manufacturer of the TV. So it may end up being that creating these models is fine, but if you recreate something copyrighted, then that will be on you.
Either way, this is going to be one interesting Supreme Court decision (because I think it is definitely going there).
znihilist t1_j6xa0o3 wrote
Reply to comment by Ronny_Jotten in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
My point is more about the fact that f(x) = x^2 doesn't have 3.95 in it anywhere. Another option would be to write f(x) as -(x-2)(x-3)(x-4)*1/6 + (x-1)(x-3)(x-4)*3.95/2 - (x-1)(x-2)(x-4)*9.05/2 + (x-1)(x-2)(x-3)*16.001/6. This recreates the original points: plug in 1 and you get -(-1)(-2)(-3)*1/6 + (0)(-2)(-3)*3.95/2 - (0)(-1)(-3)*9.05/2 + (0)(-1)(-2)*16.001/6, which is just 1.
This version of f(x) has "memorized" the inputs and is written as a direct function of them, versus x^2, which has nothing in it that can be traced back to the original inputs. Both functions are able to recreate the original inputs, although the first does so exactly (RMSE = 0) and x^2 only to an RMSE of ~0.035.
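To make the comparison concrete, here is a minimal Python sketch. The helper names `lagrange` and `rmse` are my own and not from any library; the only data is the four example points. It builds the interpolating polynomial directly from the points and compares its error with that of plain x^2.

```python
import math

# The four example points from the comment above.
points = [(1, 1.0), (2, 3.95), (3, 9.05), (4, 16.001)]

def lagrange(x, pts):
    """Evaluate the Lagrange interpolating polynomial through pts at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(pts):
        basis = 1.0
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        # Each stored y value appears explicitly as a coefficient here.
        total += yi * basis
    return total

def rmse(f, pts):
    """Root-mean-square error of f against the y values in pts."""
    return math.sqrt(sum((f(x) - y) ** 2 for x, y in pts) / len(pts))

interp_rmse = rmse(lambda x: lagrange(x, points), points)
square_rmse = rmse(lambda x: x ** 2, points)

print(f"interpolant RMSE: {interp_rmse:.6f}")
print(f"x^2 RMSE:         {square_rmse:.6f}")
```

The interpolant needs every y value as an explicit coefficient to reach zero error, while x^2 gets close without containing any of them.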
I think intuitively we recognize that these two functions are not the same, even beyond their obvious differences (the first is a 3rd-degree polynomial and the other a 2nd-degree one). The point is that while "memorize" is applicable in both cases, one stores a copy and the other is able to recreate from scratch, and I believe those mean different things in their legal implications.
Also, I think the divide on this is very interesting from a philosophical point of view. With the genie out of the bottle, short of strong societal change and pressure, that genie is never going back in.
znihilist t1_j6x5c0y wrote
Reply to comment by maxToTheJ in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
An MP3 can only recreate the original recording. It can't recreate other songs that have never been created or thought of. Compression relates exactly one input to exactly one output. As such, this comparison falls apart when you apply it to these models.
znihilist t1_j6uz705 wrote
Reply to comment by znihilist in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
Say you have a set of number pairs: (1, 1), (2, 3.95), (3, 9.05), (4, 16.001), etc. These can be fitted with x^2, but x^2 does not contain the four pairs anywhere; it can only recreate them to a certain degree of precision if you guess the right x values.
Is f(x) = x^2 memorizing the inputs or just able to recreate them because they are in the possible outcome space?
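One way to poke at that question is with a small sketch. It assumes a one-parameter model y = a * x^2 (my simplification for illustration; `fit_scale` is a made-up helper name) and fits `a` to the pairs by least squares. The fitted model ends up being a single number, and that number is not any of the values it was fitted to.

```python
# The number pairs from the comment above.
pairs = [(1, 1.0), (2, 3.95), (3, 9.05), (4, 16.001)]

def fit_scale(pts):
    """Closed-form least-squares value of a for the model y = a * x**2."""
    num = sum((x ** 2) * y for x, y in pts)
    den = sum(x ** 4 for x, _ in pts)
    return num / den

a = fit_scale(pairs)
stored_values = {y for _, y in pairs}

print(f"fitted a = {a:.5f}")
print("a is one of the training y values:", a in stored_values)
```

Everything the fitted model "knows" about the pairs is compressed into `a`, which sits very close to 1; the pairs themselves can only be approximately regenerated by evaluating the function at the right x.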
znihilist t1_j6uy7z0 wrote
Reply to comment by HateRedditCantQuitit in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
I think people are using words and disagreeing on conclusions without first agreeing on what exactly is meant by those words.
I am not sure that everyone is using the word "memorize" the same way. I think those who use it in defense of these models are saying that the images are nowhere to be found in the model itself; it is just a function that takes words as input and outputs a picture. Is the model memorizing the training data if it can recreate it? I don't know, but my initial intuition tells me there is a difference between memorizing and pattern recreation, even if they aren't easily distinguishable in this particular scenario.
znihilist t1_islbm58 wrote
Almost always this is an issue of sampling. Make sure everything is well represented everywhere.
> And why testing accuracy shouldn’t be higher than training?
There is no law that says this shouldn't happen, but in 99.99% of cases it is a sampling issue. However, when doing out-of-time testing (holding out a later time period) this can crop up, and it doesn't necessarily mean your model is flawed (in this specific context).
I had this issue with a model we were working on: we needed to prove that the model works across different time periods, so we removed the last two months of data from training and held them out for validation. It turned out that in those last months a specific subset of the data was more heavily represented than in the previous months, and it was the "good" data.
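A quick way to spot this kind of imbalance is to compare how each subset is represented before and after the time cutoff. The sketch below is only an illustration: the records, the segment labels, and the cutoff date are all invented, and `segment_shares` is a hypothetical helper, not the code we actually used.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (observation date, segment label). Invented for illustration.
records = [
    (date(2022, 1, 10), "good"), (date(2022, 1, 20), "bad"),
    (date(2022, 2, 5), "bad"),   (date(2022, 2, 18), "good"),
    (date(2022, 3, 3), "bad"),   (date(2022, 3, 25), "bad"),
    (date(2022, 4, 8), "good"),  (date(2022, 4, 21), "good"),
    (date(2022, 5, 2), "good"),  (date(2022, 5, 30), "bad"),
    (date(2022, 6, 14), "good"), (date(2022, 6, 27), "good"),
]

cutoff = date(2022, 5, 1)  # assumed cutoff: everything from this date on is held out

def segment_shares(rows):
    """Fraction of rows belonging to each segment."""
    counts = Counter(segment for _, segment in rows)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}

train = [r for r in records if r[0] < cutoff]
holdout = [r for r in records if r[0] >= cutoff]

train_shares = segment_shares(train)
holdout_shares = segment_shares(holdout)

# Print each segment's share in the training period and in the held-out period.
for segment in sorted(set(train_shares) | set(holdout_shares)):
    print(segment,
          round(train_shares.get(segment, 0.0), 2),
          round(holdout_shares.get(segment, 0.0), 2))
```

If the easy segment takes up a noticeably larger share of the held-out months than of the training months, a higher validation score is not surprising and says little about the model itself.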
znihilist t1_j704b3j wrote
Reply to comment by maxToTheJ in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
>> That's beside the point,
> It does for the comment thread which was about copyright
It doesn't, as this issue has not been decided by courts or laws yet, and opinion seems to be evenly divided. So this is circular logic.
>> my point is that the MP3 compression comparison doesn't work,
> It does for the part that is actually the point (copyright law).
You mentioned MP3 (compressed versions) as comparable in functionality, and my argument is that they are not similar in functionality, so the conclusion doesn't follow from that analysis. Compression not absolving copyright infringement doesn't mean the same conclusion holds for diffusion models. Since you asserted that it does, you need to show that compression and diffusion work the same way for the comparison to hold. It's like me saying that because it isn't illegal to look at a painting and then go home and have vivid images of that painting in my head, diffusion models are not infringing either. That would be fallacious and wrong because the functionality doesn't carry over, and the same goes for the MP3 example.