Submitted by Tooskee t3_10oif8i in technology
Ronny_Jotten t1_j6i3uog wrote
Reply to comment by CallFromMargin in Microsoft, GitHub, and OpenAI ask court to throw out AI copyright lawsuit by Tooskee
I don't know what paper you're referring to, but there's this one:
Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models
It clearly shows, at the top of the first page, the full Stable Diffusion model, trained on billions of LAION images, replicating images that are clearly "substantially similar" copyright violations of its training data. The paper cites several other papers regarding the ability of large models to memorize their inputs.
It may be possible to tweak the generation algorithm to no longer output such similar images, but it's clear that they are still present in the trained model network.
Mr_ToDo t1_j6j481z wrote
Well, they did both in that paper. But it would be interesting to know what the ones at the top were from. I know that there's one I saw further down among the high hit percents, but with as nice as they are I don't know why the rest aren't there too if they belong to that model.
Ronny_Jotten t1_j6kjrlv wrote
The paper explains what the ones at the top were from. It's using Stable Diffusion 1.4. See page 7: Case Study: Stable Diffusion, page 14: C. Stable Diffusion settings, and page 15 for the prompts and match captions. Sorry, the rest of your comment is incomprehensible to me...
Mr_ToDo t1_j6mwtay wrote
OK, that's on me. I hit the references and somehow thought I was done with the paper; I didn't think they would have the captions they used underneath that. I admit that was down to my poor due diligence. Apologies.