
dojoteef t1_j1uwubj wrote

Nice job!

Though, to produce a better comparison it's best to show two examples side-by-side (one by a human, the other by the model, in a randomized order, of course). The reason is that most people are not trained to analyze short snippets of text out of context. People trained to do that, e.g. English teachers, can better distinguish generated text without a baseline to compare against, but most people (as in a crowd-sourced evaluation) will likely produce a very biased analysis that does not reflect humans' real ability to distinguish between the two.
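The paired setup described above can be sketched as follows (a minimal, hypothetical helper, assuming you already have matched lists of human-written and model-generated texts; position is shuffled so raters can't infer the source from placement):

```python
import random

def make_pairs(human_texts, model_texts, seed=0):
    """Pair each human-written text with a model-generated one,
    randomizing left/right presentation order per pair."""
    rng = random.Random(seed)
    pairs = []
    for h, m in zip(human_texts, model_texts):
        items = [("human", h), ("model", m)]
        rng.shuffle(items)  # hide which side is which
        pairs.append(items)
    return pairs
```

Raters then see both texts at once, with the source labels stored only for later analysis.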

For a more thorough investigation of this phenomenon you can check out our research:

The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation

26

respeckKnuckles t1_j1v440q wrote

I'm not sure how the side-by-side comparison answers the same research question. If they are told one is AI and the other isn't, the reasoning they use will be different. It's not so much "is this AI?" as "which is more AI-like?"

20

dojoteef t1_j1v4j4r wrote

You don't need to tell them one is AI or model generated. It could be two model-generated texts or two human-written texts. Merely having another text for comparison allows people to better frame the task, since otherwise they essentially need to imagine a baseline for comparison, which people rarely do.

−3

respeckKnuckles t1_j1v66iq wrote

You say it allows them to "better frame the task", but is your goal to have them maximize their accuracy, or to capture how well they can distinguish AI from human text in real-world conditions? If the latter, then this establishing of a "baseline" leads to a task with questionable ecological validity.

7

Ulfgardleo t1_j1vcqri wrote

  1. you are asking humans to solve this task untrained, which is not the same as the human ability to distinguish the two.

  2. you are then also making it harder by phrasing the task in a way that makes it difficult for the human brain to solve it.

2

respeckKnuckles t1_j1vempm wrote

> you are asking humans to solve this task untrained, which is not the same as the human ability to distinguish the two.

This is exactly my point. There are two different research questions being addressed by the two different methods. One needs to be aware of which they're addressing.

> you are then also making it harder by phrasing the task in a way that makes it difficult for the human brain to solve it.

In studying human reasoning, sometimes this is exactly what you want. In fact, in some work studying Type 1 vs. Type 2 reasoning, we actually make the task harder (e.g. by adding working-memory or attentional constraints) in order to elicit certain types of reasoning. You want to see how people perform in conditions where they're not given help. Not every study is about maximizing human performance. Again, you need to be aware of what your study design is actually meant to do.

7

Ulfgardleo t1_j1vjc6q wrote

I don't think this is one of those cases. The question we want to answer is whether the texts are good enough that humans will not pick up on them. Making the task as hard as possible for humans is not indicative of real-world performance once people are presented with these texts more regularly.

1

londons_explorer t1_j1vecdr wrote

You could get a similar outcome by discarding the results of the first two or so examples in each session as "practice" trials, then recording data from the rest.
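The practice-trial idea above amounts to a simple filter over each rater's session (a hypothetical sketch, assuming responses are stored per rater in presentation order):

```python
def drop_practice(responses, n_practice=2):
    """Discard each rater's first few judgments as practice trials,
    keeping only the remaining responses for analysis."""
    return {rater: judgments[n_practice:]
            for rater, judgments in responses.items()}
```

Raters still get no explicit baseline, but their early, least-calibrated judgments don't contaminate the measured accuracy.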

2