Submitted by MrSpotgold t3_10df5tm in MachineLearning
[removed]
The digital watermark risks damaging the model outputs, though, and would be rendered useless as soon as you change the generated text yourself.
I guess we don't know how they'll do it yet, but from what I understand, the purpose is to prevent future gpt versions from training on gpt generated text, because gpt trains on text from the Internet.
As if that won’t be easy to bypass
Idk, I guess the point is that if text is 100% gpt written and not reviewed by a human, then there is a risk that gpt learns from bad gpt examples. If you review and modify it to remove the watermark, then it is effectively human reviewed/labelled content and ok for re-ingestion in future iterations.
But tbh the guys at openai are pretty capable, I'm sure they'll think of something. I don't know anything more than the headline I read.
> digital watermark
Wouldn't it be easier to store the model outputs or a perceptual hash, and then provide a way to determine if some text is similar to prior ChatGPT output? I assumed they were already doing something like this to collect usage data as they scrape new content.
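A rough sketch of what that similarity lookup could look like, assuming nothing about what OpenAI actually stores or runs. It uses word shingles and Jaccard similarity as a simple stand-in for a perceptual hash; `OutputStore`, `shingles`, and `jaccard` are made-up names for illustration, and a real system would need an index (MinHash, for example) rather than a linear scan.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased, punctuation-stripped text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class OutputStore:
    """Hypothetical store of past model outputs, kept as shingle sets."""

    def __init__(self, k=3):
        self.k = k
        self.entries = []

    def add(self, text):
        self.entries.append(shingles(text, self.k))

    def best_match(self, text):
        """Highest similarity between `text` and any stored output (linear scan)."""
        query = shingles(text, self.k)
        return max((jaccard(query, e) for e in self.entries), default=0.0)


store = OutputStore()
store.add("The quick brown fox jumps over the lazy dog near the river bank.")
same = store.best_match("the quick brown fox jumps over the lazy dog near the river bank")
edited = store.best_match("The quick brown fox leaps over the lazy dog near the river bank.")
unrelated = store.best_match("Watermarking plain text is much harder than watermarking images.")
```

Changing a single word only knocks out the shingles that contain it, so light edits still score fairly high, while a thorough rewrite drives the score toward zero.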
ChatGPT already has a distinctive writing style. I'm not sure how you could add anything to the text that would do better than that and couldn't be trivially removed.
Not really, I tried ChatGPT a few days ago. I gave it a topic I had written an essay on before and asked it to rewrite it. I sent both texts to my father, who knows my writing style, and he was unable to tell who wrote which one. To be fair, you can tell the AI to give you a whole paragraph in other words, which often improves the language.
It shouldn't be too difficult to produce a watermark provided the output is something on the order of a paragraph. However, I don't think it's always possible. For instance, I could ask ChatGPT to replicate the previous paragraph, replacing all nouns and verbs while keeping the same meaning.
Further tweaking by a human should completely destroy any residual watermark.
I’m really curious how that would work. It seems very constraining to watermark text. Any existing solutions? For audio and pictures it seems pretty straightforward but for text?
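For what it's worth, here is a toy sketch of one idea that has been floated in the research literature: nudge the sampler toward a secret, pseudo-random "green" half of the vocabulary, then detect by counting green tokens. Nobody in this thread knows what OpenAI will actually do, so treat all of it as an assumption: the vocabulary, the key, and the `generate` function are made up, and `generate` picks random tokens instead of running a language model.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(1000)]  # made-up vocabulary, not real tokens


def is_green(prev_token, token, key="demo-key"):
    """Keyed hash of (previous token, token) puts about half of all pairs on the green list."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(n, watermark, seed=0):
    """Stand-in for sampling: draw 8 random candidates per step, prefer green ones if watermarking."""
    rng = random.Random(seed)
    tokens = ["<s>"]
    for _ in range(n):
        candidates = rng.sample(VOCAB, 8)
        if watermark:
            green = [c for c in candidates if is_green(tokens[-1], c)]
            candidates = green or candidates  # fall back if no candidate is green
        tokens.append(rng.choice(candidates))
    return tokens[1:]


def z_score(tokens, key="demo-key"):
    """How many standard deviations the green count sits above the 50% expected by chance."""
    n = len(tokens)
    if n == 0:
        return 0.0
    prev, hits = "<s>", 0
    for tok in tokens:
        hits += is_green(prev, tok, key)
        prev = tok
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


marked = generate(200, watermark=True)
plain = generate(200, watermark=False)
z_marked = z_score(marked)
z_plain = z_score(plain)
```

In this toy, the watermarked sequence scores far above chance while the plain one stays near zero. Since each check depends on a token and its predecessor, swapping out enough words pulls the score back toward zero, which is the "trivially removed" worry raised above.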
[deleted]
If you can detect if something is written by chatgpt then you can fine tune a model to adapt the text and avoid detection.
It could be as simple as storing everything chatGPT creates into a searchable database to detect if it was created by the AI.
That seems like an easy solution, even if they just stored it for a few weeks.
Could be done for chatGPT but once an open source version is available this won't be possible.
If the whole motivation here is to detect the cheating student, most cheating students won’t simply copy and paste but will spend at least 5 - 15 min making modifications and writing some in their own language.
Policing cheating beyond punishing those who obviously are cheating in the worst ways is not as important as one might think. Cheating on highly competitive graduate school entrance exams is something to strictly police, but not an English writing assignment or a math word problem. It sounds corny, but the student in those cases really is just cheating themselves.
Any professor who talks to you in person, in class discussions, or in office hours, or who sees how you interact in group projects, and any employer who works with you on complex real-world problems chatGPT can’t solve, will know very quickly that you (the cheater) don’t have a firm grasp of the material, the prerequisites, or the know-how to apply it. The student or employee who does have that know-how and understanding gets the promotion and the better reference, still has the better grades (on average over all classes), interviews much better for jobs, and can speak intelligently about what they accomplished and solve problems on the spot on a whiteboard, while the cheater fumbles and cannot pull out chatGPT during the interview, lol.
[deleted]
It is easy to train a model to rewrite text in the style of other texts.
No and no
To expand on this: no, you can't expect a model to perform a task it's not trained for, and no, chatgpt should not be trained to recognise ai generated output. That's not what the architecture is good for.
[deleted]
[deleted]
Doubt it.
Even if it does, that doesn't mean it has a search function.
Yeah well, that's not really how these models work. There's no pulling from a database and there's no external searching. The model was trained and frozen.
While it is possible to have the model access some external database in the future, yeah...that's not going to happen in relation to previous chat entries you have no right or access to. That's a privacy can of worms no corporation with any sense will get into as well as being prohibitively expensive for no real gain at all.
OpenAI stores the chat logs. That does not mean ChatGPT has any way to search through them.
[deleted]
There could be a separate database and algorithm to detect this if they wanted to, but this wasn’t a goal of chatGPT.
You wouldn’t need AI/ML to do this. Also note that it isn’t impossible for a human to write exactly what chatGPT would have written, especially for very short responses, without knowing chatGPT would respond the same way.
Why do you “need” this? Just curious.
This software is a nightmare for anyone in the teaching business (whether secondary school or higher education) where assessment is based on essays. I'm not kidding: a nightmare. We are going to bring up kids who will not be able to write a comprehensive text simply because we lack the means to check that they wrote it themselves, and therefore we must abandon the assessment method altogether. It's that bad.
No it’s not.
Anyone who wanted to cheat on a take-home essay or assignment always could, and anyone who has to write an essay in-class monitored for more critical and competitive standardized tests cannot be pulling out their devices and typing into chatGPT, which doesn’t write A+ essays that a teacher can’t detect are “a little off” anyway.
As a former educator myself, I always knew which students had mastered the material and could intelligently talk about it in class discussions, during office hours, and through in-class essays/quizzes where they could not cheat while I closely monitored. They couldn’t get an A+ by simply cheating on a few of the take-home essays, and the typical cheaters are cheating just to get by and still end up with inferior grades to those who master the subject.
Furthermore, concentrating too much on catching cheaters takes away from time you could be spending enriching the learning experience of everyone else.
It also sounds corny but is true: When you cheat, you’re only cheating yourself. Cheating really is self-policing in many instances. When we interview candidates who have a degree and a high GPA, it’s very obvious if they just got good grades but are clueless, and we don’t hire them. It might be cheating, or maybe grade inflation, or perhaps just short-term memorizing without actually retaining or understanding what they were learning, but it’s night and day.
Those who truly care to learn will excel in their jobs and get better promotions. ChatGPT isn’t going to help you there.
Having said that, I would consider modifying the curriculum if you only give take-home work and it counts for 90% of the grade, but it’s not worth stressing over. Put your effort into teaching and enriching the lives of those who want to learn and yearn for knowledge. You’re an educator first; police work is a side gig you can’t ignore, but it isn’t your main purpose.
ChatGPT is beyond cheating. We have to go on the default that it is applied in essay writing. And surely you agree that it is pointless to assess the output of a machine. Therefore, essay writing will cease to be a method of assessment, and consequently, whichever way you look at it, future students will no longer learn to write.
First, I’m blown away that you are suggesting that you don’t know your students and their writing styles, some of which are performed in-class and almost all of which differ significantly from the way ChatGPT writes, but second, my teachers said the exact same thing you are saying decades ago and freaked out when CliffsNotes came out!
Re-read my prior argument because nothing you just said impacts it, and it still stands.
I appreciate your comments. We don't have to agree. Moreover, I could be wrong.
I've tried that, but no joy.
As it's designed to emulate different styles of writing I doubt it's a "coming soon" thing.
[deleted]
dmart89 t1_j4l4vyz wrote
Right now, no. They're working on a digital watermark for model outputs to distinguish whether gpt or a human wrote something.