Submitted by TiredOldCrow t3_y7mwmw in MachineLearning
GPT-3 and the multitude of similar models released over the last couple of years likely represent a serious threat to scientific conferences. What can we do about it?
Some historical context: computer-generated papers have been showing up in major publications since a model called SCIgen was released in 2005.
SCIgen uses a simple context-free grammar to produce templatized papers that are basically pseudo-scientific gibberish. People are still finding those papers in the literature many years later. These papers are generally churned out to inflate citation statistics, or by well-meaning researchers probing suspected low publication standards at existing venues (a practice I generally don't recommend, since it adds to the mountain of papers we have to churn through as reviewers).
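To make the mechanism concrete, here is a minimal, purely illustrative sketch of grammar-based generation in Python. The toy rules below are my own invention, not SCIgen's actual grammar:

```python
import random

# Toy context-free grammar in the spirit of SCIgen (illustrative only):
# non-terminals map to lists of possible productions.
GRAMMAR = {
    "SENTENCE": [["We", "VERB", "that", "NOUN_PHRASE", "is", "ADJ", "."]],
    "VERB": [["demonstrate"], ["argue"], ["confirm"]],
    "NOUN_PHRASE": [["the", "ADJ", "NOUN"], ["our", "NOUN"]],
    "ADJ": [["scalable"], ["Bayesian"], ["quantum"]],
    "NOUN": [["framework"], ["methodology"], ["heuristic"]],
}

def expand(symbol: str) -> str:
    """Recursively expand a non-terminal by picking a random production."""
    if symbol not in GRAMMAR:          # terminal word: return as-is
        return symbol
    production = random.choice(GRAMMAR[symbol])
    return " ".join(expand(s) for s in production)

if __name__ == "__main__":
    # Each call yields a grammatical but meaningless "scientific" sentence.
    for _ in range(3):
        print(expand("SENTENCE"))
```

The point is that this takes a few dozen lines and no training data, which is exactly why SCIgen-style spam has been cheap to produce for nearly two decades.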
There’s a negligible chance that existing review processes will fare any better against 175B-parameter generative Transformers.
Work has already been published that uses GPT-2 to “assist” in scientific writing. Using such a model as a writing aid is not necessarily academically dishonest, but these tools nevertheless make it far easier to churn out fake papers.
Even if major conferences find creative solutions to this, the next rung of venues below them is likely to learn about these threat models the hard way. After all, to an unethical researcher citing their own work at the bottom of a machine-generated paper, a citation is a citation, even in a low-quality publication.
--
Open Questions
Threat severity: How serious is this threat? Is this really just an old problem that new generative models won’t make meaningfully worse?
Improving peer review: Does this just come back to reproducibility? Should we be reviewing certain papers by their code, rather than just their text? Do we need a way to weight the value of research citations by the quality of the work doing the citing?
Acceptability of machine text: How do we decide when machine-generated text is acceptable versus unacceptable? Will detection models for machine-generated text end up creating algorithmic biases against people who speak English as a second language and rely on translation models or writing assistants to help them write?
Legitimate AI-generated papers? Are there papers with real scientific value that could be entirely written by algorithms? E.g., could survey papers one day be produced almost entirely by a specialized model?
Defenses: What are some technical or social solutions that can help defend against this kind of abuse? (One toy detection sketch follows below this list.)
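As one concrete (and far from sufficient) technical example, here is a rough sketch of perplexity-based flagging with an off-the-shelf language model, similar in spirit to tools like GLTR. The choice of GPT-2 and the threshold below are arbitrary assumptions for illustration, not a proven detector:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under GPT-2; machine text tends to score lower."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return torch.exp(loss).item()

def looks_machine_generated(text: str, threshold: float = 20.0) -> bool:
    # Hypothetical cutoff: human prose is usually more "surprising" to the model.
    return perplexity(text) < threshold
```

Note that exactly this kind of statistical detector is where the second-language-speaker bias question above becomes acute: translated or assistant-polished human prose can also score as suspiciously fluent.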
--
This is Part 1 of a planned Tuesday discussion series on threat models, based on the things that have kept us up at night since a recent survey paper on the threat models of machine-generated text.
countably_infinite_ t1_iswvh51 wrote
A paper has merit based on its academic contribution. Being clear, precise, and easily understandable makes it a better paper. If you manage to prompt an LLM so that the result excels by these criteria, it should be accepted.
With the attitude some students and vanity authors have, there is already an incentive to produce pseudo-scientific mumbo-jumbo and see if you can get away with it. I mostly see a problem here in scalability, i.e. the sheer mountain of noise that can be generated. Similar dangers apply to journalism and democratic discourse in general, imho.
One mitigation strategy that already works decently is reputation (for authors, groups, conferences, ...), but of course this comes at the risk of overlooking brilliant work that comes from left field.