
adt t1_j9nv4zj wrote

>shouldn't the onus of delineating man from machine be on the side providing the AI chatbot?

It is.

Here's a very long read, but it explains how OpenAI is building watermarking into its models, for use by governments, themselves, and maybe academia.

https://scottaaronson.blog/?p=6823

>'to watermark, instead of selecting the next token randomly, the idea will be to select it pseudorandomly, using a cryptographic pseudorandom function, whose key is known only to OpenAI. That won’t make any detectable difference to the end user, assuming the end user can’t distinguish the pseudorandom numbers from truly random ones. But now you can choose a pseudorandom function that secretly biases a certain score—a sum over a certain function g evaluated at each n-gram (sequence of n consecutive tokens), for some small n—which score you can also compute if you know the key for this pseudorandom function'
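Here's a toy Python sketch of the quoted idea. To be clear, this is my own illustration, not OpenAI's code: the key, the SHA-256 stand-in for the cryptographic PRF, and the Gumbel-style selection rule (which reproduces ordinary sampling marginally while secretly embedding the bias) are all assumptions.

```python
import hashlib
import math

SECRET_KEY = b"known-only-to-provider"  # hypothetical key; the real key/PRF are unknown

def g(context, token, key=SECRET_KEY):
    """Keyed pseudorandom score in [0, 1) for the n-gram (context + token).
    A keyed SHA-256 hash stands in for the cryptographic PRF."""
    digest = hashlib.sha256(key + repr((context, token)).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def sample_watermarked(probs, context, n=4):
    """Select the next token as argmax of g**(1/p) over candidate tokens.
    Marginally this is equivalent to ordinary sampling from `probs`, but it
    secretly favors tokens whose keyed g-score is high -- the detectable bias."""
    ngram_context = tuple(context[-(n - 1):])  # the n-1 preceding tokens
    best_token, best_score = None, -math.inf
    for token, p in probs.items():
        if p <= 0:
            continue
        r = max(g(ngram_context, token), 1e-300)  # avoid log(0)
        score = math.log(r) / p  # monotone in r**(1/p)
        if score > best_score:
            best_token, best_score = token, score
    return best_token
```

Without the key, an observer (conjecturally) can't tell this apart from ordinary random sampling, which is exactly the point of the quote.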

And why they wouldn't just stick it in a database of logs:

>'Some might wonder: if OpenAI controls the server, then why go to all the trouble to watermark? Why not just store all of GPT’s outputs in a giant database, and then consult the database later if you want to know whether something came from GPT? Well, the latter could be done, and might even have to be done in high-stakes cases involving law enforcement or whatever. But it would raise some serious privacy concerns: how do you reveal whether GPT did or didn’t generate a given candidate text, without potentially revealing how other people have been using GPT? The database approach also has difficulties in distinguishing text that GPT uniquely generated, from text that it generated simply because it has very high probability (e.g., a list of the first hundred prime numbers).'
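For what it's worth, detection with the key would then look something like this sketch (reusing the `g` from above): rescore every n-gram and check whether the average sits improbably far above the ~0.5 you'd expect from text generated without the key.

```python
def watermark_score(tokens, n=4):
    """Average keyed g-score over every n-gram in `tokens`. Output from
    sample_watermarked averages noticeably above 0.5; text produced
    without the key hovers near 0.5, so a threshold test separates them."""
    scores = [
        g(tuple(tokens[i - n + 1:i]), tokens[i])
        for i in range(n - 1, len(tokens))
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical usage: flag `candidate_tokens` if the score clears a chosen margin.
# is_watermarked = watermark_score(candidate_tokens) > 0.5 + margin
```

Unlike the database approach, this needs only the key and the candidate text, which is the privacy point the quote is making.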

7

RaccoonProcedureCall t1_j9o1gl1 wrote

Forgive me for not reading the entire post you linked, but is the plan that this watermarking would not be detectable by the general public out of concerns for “privacy”? Also, has this been implemented in ChatGPT (or do we know)?

Also, it surprises me that someone from OpenAI would acknowledge the shortcomings of their current measures for identifying AI-generated content.

2

wbsgrepit t1_j9qyscw wrote

The problem is that these kinds of watermarks, where the model's outputs are tweaked with a key to bend the token choices, are easily obliterated by double-dipping: use ChatGPT to generate, then another paraphrasing LLM to rewrite. Text canaries are brittle af.
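A toy illustration of why (made-up sentences, hypothetical tokenization): paraphrasing rewrites the token sequence, so essentially none of the keyed n-grams survive and any n-gram-based score falls back to chance.

```python
# Hypothetical example; a real paraphraser does the same thing at scale.
original   = "the quick brown fox jumps over the lazy dog".split()
paraphrase = "a fast brown fox leaps over a sleepy dog".split()

n = 4  # same n-gram width the watermark scores over
orig_ngrams = {tuple(original[i:i + n]) for i in range(len(original) - n + 1)}
para_ngrams = {tuple(paraphrase[i:i + n]) for i in range(len(paraphrase) - n + 1)}

shared = orig_ngrams & para_ngrams
print(f"{len(shared)} of {len(orig_ngrams)} watermarked n-grams survive")  # 0 of 6
```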

2

RaccoonProcedureCall t1_j9rcsir wrote

Yeah, and I believe the author of that blog post acknowledges as much. I suppose being able to detect some text is better than being able to detect none. Maybe that's why watermarking is still being pursued, but I can hardly speak for the author or for OpenAI.

1