Submitted by TiredOldCrow t3_y7mwmw in MachineLearning
GPT-3 and the multitude of similar models that have come out over the last couple of years likely represent a serious threat to scientific conferences. What can we do about it?
Some historical context: computer-generated papers have been showing up in major publications since SCIgen was released in 2005.
SCIgen uses a simple context-free grammar to produce templatized papers that are basically pseudo-scientific gibberish. People are still finding those papers many years later. These papers are generally churned out to inflate citation statistics, or by well-meaning researchers probing suspected low publication standards at certain conferences (which I generally don’t recommend, since it adds to the mountains of papers that we have to churn through as reviewers).
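To make that concrete, here’s a minimal sketch of context-free-grammar text generation in the spirit of SCIgen. The grammar below is an illustrative toy, not SCIgen’s actual ruleset, which has far more production rules:

```python
import random

# Toy context-free grammar in the spirit of SCIgen (illustrative only;
# SCIgen's real grammar is much larger).
GRAMMAR = {
    "SENTENCE": [["We", "VERB", "a", "ADJ", "METHOD", "for", "TASK"]],
    "VERB": [["propose"], ["present"], ["evaluate"]],
    "ADJ": [["novel"], ["scalable"], ["robust"]],
    "METHOD": [["framework"], ["heuristic"], ["methodology"]],
    "TASK": [["Byzantine fault tolerance"], ["cache coherence"], ["the study of DHTs"]],
}

def expand(symbol: str) -> str:
    """Recursively expand a nonterminal; terminals are returned as-is."""
    if symbol not in GRAMMAR:
        return symbol
    production = random.choice(GRAMMAR[symbol])
    return " ".join(expand(s) for s in production)

# Each call yields a grammatical but scientifically meaningless sentence.
for _ in range(3):
    print(expand("SENTENCE") + ".")
```

Every output is grammatical and superficially technical, which is exactly why surface-level review can let these papers through.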
There’s a negligible chance that existing review processes will fare any better against 175B-parameter generative Transformers.
Work has already been published that uses GPT-2 to “assist” in scientific writing. While using such a model during writing is not necessarily academically dishonest, such tools nevertheless make it much easier to churn out fake papers.
Even if major conferences find creative solutions to this, the next rung of venues below them is likely to learn about these threat models the hard way. After all, to an unethical researcher citing their own work at the bottom of a machine-generated paper, a citation is a citation, even in a low-quality publication.
--
Open Questions
Threat severity: How serious is this threat? Or is this really an old problem that new generative models won’t make dramatically worse?
Improving peer review: Does this just come back to reproducibility? Should we be reviewing certain papers by their code, rather than just their text? Do we need a way to weight the value of research citations by the quality of the work doing the citing?
Acceptability of machine text: How do you decide when machine-generated text is unacceptable versus acceptable? Will detection models for machine-generated text end up creating algorithmic biases that target people who speak English as a second language and rely on translation models or writing assistants to help them write?
Legitimate AI-generated papers: Are there papers with real scientific value that could be entirely written by algorithms? E.g., could survey papers one day be produced almost entirely by a specialized model?
Defenses: What are some technical or social solutions that can help defend against this type of abuse?
--
This is Part 1 of a planned Tuesday discussion series on threat models, based on the things that have been keeping us up at night since working on a recent survey paper on the threat models of machine-generated text.
TiredOldCrow OP t1_isvei0e wrote
My own thinking is that large venues might implement automated filtering using detection models. Detection accuracy for machine-generated text increases with sequence length, so papers containing large amounts of generated text stand reasonable odds of detection.
That said, the results of this detection would likely need to be scrutinized by a reviewer anyway (especially if conferences don’t ban AI writing assistants altogether).
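For illustration, here’s a rough sketch of what that kind of first-pass filtering could look like. The detector model name, the chunking scheme, the label names, and the threshold are all placeholder assumptions, and anything flagged would still go to a human reviewer per the caveat above:

```python
from transformers import pipeline

# Placeholder detector: an off-the-shelf RoBERTa-based GPT-2 output detector.
# A real deployment would need its own calibrated detector and thresholds.
detector = pipeline("text-classification",
                    model="openai-community/roberta-base-openai-detector")

def flag_submission(paper_text: str, window: int = 2000, threshold: float = 0.9) -> bool:
    """Flag a paper if any long chunk of it looks machine-generated.

    Detection works better on longer sequences, so we score overlapping
    character chunks rather than individual sentences.
    """
    step = window // 2
    chunks = [paper_text[i:i + window] for i in range(0, len(paper_text), step)]
    for chunk in chunks:
        result = detector(chunk, truncation=True)[0]
        # Label names depend on the detector's config (assumed here to be "Fake"/"Real").
        if result["label"] == "Fake" and result["score"] >= threshold:
            return True
    return False

# Flagged submissions get routed to a reviewer rather than auto-rejected.
print(flag_submission("Abstract. We propose a novel framework for ..."))
```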