Exnur0 OP t1_j1v2zrb wrote on December 27, 2022 at 4:58 PM

Reply to comment by hjmb in [Discussion] 2 discrimination mechanisms that should be provided with powerful generative models e.g. ChatGPT or DALL-E by Exnur0

Thanks for the insightful comment, I think you helped me get out the rest of my thought.

I definitely agree, both of these have gaping holes in them if anyone with expertise comes along. The second mechanism is meant to plug holes in the first, but people can definitely construct media to get past the second anyway. Ideally it would return a value 0-1 and inform the human's level of suspicion, and not just a boolean, but still, it's likely that certain tricks would end up helping someone get things past it.

I think these are useful mostly because of what they may be able to accomplish with regard to the scale of the problem - any additional effort required to pass off AI work as human is a good thing, as far as I'm concerned, and some of the scariest implications of these kinds of models come from their scale - moderation is more or less impossible, if you have to deal with limitless examples of generated content.

For example, take the problem of misinformation, like what happened on StackOverflow (GPT-generated answers were banned, largely because they're often wrong, is my understanding). Imagine that StackOverflow had access to an API that could reliably point out unedited (or close to it) generated content. In that case, the scope of the problem shrinks to only those people willing to put in the effort to slip things past the discriminator, which hopefully will be much smaller of a set.

I also definitely agree that there are other problems that aren't solved by discrimination at all, even if discrimination was perfect - really, the underlying point is that the labs cranking out powerful generative models could be doing much, much more in terms of accompanying tooling, to try to decrease the negative impacts of their tech. I don't see what I'm describing as bulletproof or as always useful, but it strikes me as kind of a bare-minimum precaution. If nothing else, I should be able to put completely unedited generated media into an API and get back an answer that it was generated.