
Featureless_Bug t1_j1uygsw wrote

Why should they do it again?

5

gkaykck t1_j1uzq8f wrote

Personally, I'd like to be able to filter AI-generated content out of my feeds sometimes.

1

Featureless_Bug t1_j1v4w14 wrote

Sure, some users might be interested in it. Why would OpenAI do it, though? Especially given the wide range of open-source alternatives that you can run on your own cluster.

2

gkaykck t1_j1v72ay wrote

I think if this is going to be implemented, it has to be at the model level, not as an extra layer on top. Just thinking out loud with my not-so-great ML knowledge: if we mark every image in the training data with some special, static "noise" that is unnoticeable to human eyes, every generated image will carry the same "noise". That would cover open-source alternatives running on your own cluster too: if this kind of "watermarking" is implemented, it needs to be baked into the model itself.
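
To make that concrete, here's a rough sketch of the classic spread-spectrum idea (Python/NumPy; the seed, strength, and threshold are all made up for illustration): add a fixed, faint pseudo-random pattern, then check for it later by correlating against the same pattern. Whether a pattern planted in training data would actually survive into a generative model's outputs is exactly the open question.

```python
import numpy as np

def make_pattern(shape, seed=42, strength=2.0):
    """Fixed pseudo-random noise pattern, far below the perceptual threshold."""
    rng = np.random.default_rng(seed)  # hypothetical shared secret
    return rng.standard_normal(shape) * strength

def embed(image, pattern):
    """Add the pattern to an 8-bit image (HxWx3, values 0-255)."""
    return np.clip(image.astype(np.float64) + pattern, 0, 255).astype(np.uint8)

def detect(image, pattern, threshold=0.01):
    """Normalized correlation against the known pattern; high score = marked."""
    img = image.astype(np.float64)
    img -= img.mean()
    pat = pattern - pattern.mean()
    score = (img * pat).sum() / (np.linalg.norm(img) * np.linalg.norm(pat) + 1e-12)
    return score > threshold
```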

When it comes to "why would OpenAI do it": it would be nice for them to be able to track where their generated pictures/content ends up, for investors etc. This could also help them "license" the images generated with their models instead of charging per run.

1

Exnur0 OP t1_j1vgj6f wrote

You don't actually have to watermark images in order to know that you generated them, at least not if you're checking exactly the same image - you can just hash the image, or store a low-dimensional representation of it as a fingerprint (people sometimes use color histograms; in principle you could use anything). Then you can look up images against that data to see if it's one of the ones you produced.
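
A toy sketch of both options (Python/NumPy; the bin count and tolerance are invented for illustration): a cryptographic hash for exact-byte matches, and a normalized color histogram as a cheap fingerprint that tolerates small edits.

```python
import hashlib
import numpy as np

def exact_key(image_bytes):
    """Exact-match lookup key; any change to the file breaks the match."""
    return hashlib.sha256(image_bytes).hexdigest()

def histogram_fingerprint(pixels, bins=16):
    """Coarse per-channel color histogram of an HxWx3 uint8 array."""
    hists = [np.histogram(pixels[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(np.float64)
    return h / h.sum()  # normalize so image size doesn't matter

def seen_before(fp, stored_fps, tol=0.05):
    """Match against stored fingerprints by L1 distance."""
    return any(np.abs(fp - s).sum() < tol for s in stored_fps)
```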

1

Brudaks t1_j21x8ut wrote

Thing is, we can't really do that for text - natural language has no free variation where you could insert enough bits of special noise unnoticeable to human readers. Well, you could add some data with formatting or Unicode trickery, but that would be trivially removable by anyone who cared.
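
For instance, the Unicode trick usually means hiding bits in zero-width characters, and a sketch of it (plain Python, encoding scheme invented here) shows exactly why it's so fragile:

```python
ZW0, ZW1 = "\u200b", "\u200c"  # zero-width space / zero-width non-joiner

def embed_bits(text, bits):
    """Append one invisible character per word to encode a bit string."""
    words = text.split(" ")
    marked = [w + (ZW1 if b else ZW0) for w, b in zip(words, bits)]
    return " ".join(marked + words[len(bits):])

def strip_watermark(text):
    """The entire 'attack': delete the invisible characters."""
    return text.replace(ZW0, "").replace(ZW1, "")
```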

1

Featureless_Bug t1_j1veefz wrote

>I think if this is going to be implemented, it has to be at the model level, not as an extra layer on top. Just thinking out loud with my not-so-great ML knowledge: if we mark every image in the training data with some special, static "noise" that is unnoticeable to human eyes, every generated image will carry the same "noise".

This is already wrong - it might work, or it might not; nothing guarantees the model reproduces the mark in its outputs.

>That would cover open-source alternatives running on your own cluster too: if this kind of "watermarking" is implemented, it needs to be baked into the model itself.

Well, of course the open-source models will be trained on data without any noise added - people are not stupid.

>When it comes to "why would OpenAI do it": it would be nice for them to be able to track where their generated pictures/content ends up, for investors etc. This could also help them "license" the images generated with their models instead of charging per run.

Well, OpenAI won't do it because no one wants watermarked images. If they tried to watermark their outputs, people would be even more likely to switch to the open-source alternatives. That's why OpenAI won't do it.

0

Eggy-Toast t1_j1vhtgg wrote

“This is already wrong - it might work, or it might not” - disingenuous much?

The point of that proposed watermark is that it can be imperceptible to the human eye yet still detectable by some algorithm or model. It would only add value to the product, though perhaps not enough to justify the cost of implementing it.

I also think your comments entirely overlook the fact that DALL-E 2 already has a watermark implementation - it's in no way subtle, but it can be cropped out.

1

perta1234 t1_j1w1872 wrote

My question too. If a simple request is enough to get the work done, the work was too easy. The standard outputs are quite boring and low-quality, but with good additional criteria, limits, and requests, the output begins to become interesting - at that point, though, it is more like co-creation. In my testing, I found I can save about 30% of the time compared to writing it myself, not more. Since the AI tends to hallucinate when the subject is challenging, you need to direct it carefully. The main benefit of using the AI is that it forces you to think through and describe the requested text's outline, focus, and aim very well; once that is done, the writing is easy anyway. The AI just writes better English than some of us.

1