Submitted by YoutubeStruggle t3_10ofcis in MachineLearning

Have you tried ChatGPT? It's super cool, but some users are also using it to create automated content submissions, resulting in an increase in AI-generated plagiarism. I made a tool as a college project to detect AI-generated content.
Go ahead and validate your content on AI Content Detector.
If you are an educator worried about automated content submissions or a developer worried about search engine penalties, this tool will help you efficiently detect content generated using AI.

0

Comments

mkzoucha t1_j6ealr7 wrote

Too many false positives and ways to trick it; this has been proven over and over again.

9

YoutubeStruggle OP t1_j6ebgjv wrote

Did you give it a try? It shouldn't be easy to fool this tool. Can you give an example of when it gives a false positive? Your feedback is appreciated.

−11

mkzoucha t1_j6ed1z9 wrote

I did not have time to try this specific one, but I have tried at least 10 others. Sorry, not trying to be negative or anything. There are just tons of different models, each of which would need a separate detection model. These models were trained on human writing, so their output is bound to sound human, and some humans are bound to have a writing voice similar to the output of AI content creators. There is also no real standard ‘human’ way of writing to clearly separate the two. Combine that with the difference in results based on the prompt, and it quickly becomes an insurmountable task in my opinion.

At the end of the day, I applaud your efforts, truly, but realistically I think your model is significantly overfit to a very small percentage of possible samples, both AI- and human-generated.

9

YoutubeStruggle OP t1_j6efuzb wrote

I agree, but the point is that an AI such as ChatGPT will always have one way of generating content, whereas humans have diverse ways of writing. If we consider an essay or an article, a human's way of writing would vary with every single sentence, but the AI's would remain the same throughout. That's how AI-generated content can be detected. If we do paragraph-wise analysis, we get better results and a clearer picture, which is not the case for sentence-wise analysis. And there should not be any way for all of a particular human's paragraphs to come out detected as AI-generated.
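
A minimal sketch of what paragraph-wise analysis could look like, for illustration only. The names `score_fn` and `threshold` are hypothetical and stand in for whatever detector and cutoff are actually used; nothing here is the tool's real code. The last helper shows the arithmetic behind the "not every paragraph" argument, and it only holds under the strong assumption that errors on different paragraphs are independent, which may not be true for a single author with a consistent style.

```python
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split text on blank lines and drop empty chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def fraction_flagged(
    text: str,
    score_fn: Callable[[str], float],  # hypothetical detector: returns P(AI) in [0, 1]
    threshold: float = 0.5,            # hypothetical cutoff, not taken from the tool
) -> float:
    """Return the share of paragraphs whose score reaches the threshold."""
    paragraphs = split_paragraphs(text)
    if not paragraphs:
        return 0.0
    flagged = sum(1 for p in paragraphs if score_fn(p) >= threshold)
    return flagged / len(paragraphs)


def prob_all_flagged(per_paragraph_fpr: float, n_paragraphs: int) -> float:
    """Chance that every paragraph of a human text is flagged.

    Assumes paragraph-level errors are independent (an assumption, not a fact).
    """
    return per_paragraph_fpr ** n_paragraphs
```

For example, with an assumed 10% per-paragraph false positive rate and 5 paragraphs, `prob_all_flagged(0.1, 5)` is about 0.00001, but correlated errors would make the real figure higher.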

−10

mkzoucha t1_j6edujv wrote

Also, one more thing: at the end of the day, there is no way to prove it either way without having students record their screens and entire rooms (or only write in person) when writing papers.

3

YoutubeStruggle OP t1_j6eh9z2 wrote

AI can generate text that resembles human writing, but it is still not capable of truly replicating the depth and nuance of human writing. AI text generation models can generate text that is coherent and grammatically correct, but it lacks the personal touch, creativity, and emotional depth that is unique to human writing. This is because AI is trained on large amounts of data and generates text based on statistical patterns in the data, whereas human writing is influenced by personal experiences, emotions, and individual perspectives. Additionally, AI text generation models may still struggle with context-awareness and understanding the full meaning behind the words it is generating. So, AI-generated content can often be distinguished from human-written content by its lack of originality and personal touch.

That's what ChatGPT thinks about writing text resembling human-generated content :)

−3

mkzoucha t1_j6eied9 wrote

What are your training sample sizes? What about the test set? How was your data compiled and labeled? Which AI models did you use? What were the sources of human writing?

2

YoutubeStruggle OP t1_j6ekdz9 wrote

My total data size is 40K paragraphs. I used roberta-base, and ChatGPT was used for the AI-generated sentences.

0

royalemate357 t1_j6eq454 wrote

I tried it with ChatGPT, and it correctly identified the text as AI-generated when I used the output exactly. But when I changed the capitalization of the first letter in the sentence and removed a few commas, it changed to human-generated (84%). It seems to me it's kind of a superficial detector and is quite easy to fool. Also, what is the false positive rate? If this tool or others are used to flag students for plagiarism, it had better be pretty close to zero.
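
The perturbation described here is easy to script, so a rough sketch follows. `perturb` flips the case of the first letter and drops the first few commas; `label_flips` reports whether a detector's label changes as a result. `classify_fn` and `toy_classify` are hypothetical stand-ins, not the actual tool, and the toy rule is deliberately superficial so the flip is visible.

```python
from typing import Callable


def perturb(text: str, max_commas_removed: int = 2) -> str:
    """Flip the case of the first character and remove the first few commas."""
    if not text:
        return text
    first = text[0]
    flipped = first.lower() if first.isupper() else first.upper()
    # str.replace with a count only touches the first `max_commas_removed` commas.
    return (flipped + text[1:]).replace(",", "", max_commas_removed)


def label_flips(text: str, classify_fn: Callable[[str], str]) -> bool:
    """Return True if the perturbation changes the predicted label."""
    return classify_fn(text) != classify_fn(perturb(text))


def toy_classify(text: str) -> str:
    """Hypothetical, deliberately shallow detector used only for this demo."""
    looks_polished = text[:1].isupper() and text.count(",") >= 2
    return "ai" if looks_polished else "human"


sample = "Hello, world, again, friend."
print(perturb(sample))
print(label_flips(sample, toy_classify))
```

Running the same check over many samples would give a flip rate, which is one rough way to quantify how superficial a detector is.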

4

JaCraig t1_j6fws2x wrote

Just adding on that I used ChatGPT, and adding any sort of "write it in the style of X" to the end of the prompt fools it. Tell it to use some run-on sentences, etc., and it's the same thing.

1

MrEloi t1_j6ebcks wrote

Students should declare use of AI tools.

Educators should accept - ideally encourage - AI tool use.

3

YoutubeStruggle OP t1_j6ec8vq wrote

The use of AI tools should definitely be appreciated. It saves a lot of time, and as a fellow developer, I would highly encourage it. But AI-generated content could be misleading, so it is important to be able to distinguish it from human-generated content. Detecting AI-generated content can also help ensure the quality of information being shared and consumed, especially in sensitive domains such as news and medicine.

1

[deleted] t1_j6eehil wrote

[deleted]

1

YoutubeStruggle OP t1_j6ehkfl wrote

ChatGPT supremacy ;)

1

MrEloi t1_j6ei3ih wrote

The whole of the web will soon be just AI generated content.

3

YoutubeStruggle OP t1_j6eiymv wrote

TBH, that sounds scary. AI will make life much faster and increase the productivity of every individual. But I believe that in various sensitive domains where the quality of information is what matters, human content will remain.

1

gunshoes t1_j6fyskw wrote

Eh, depends on context. People forget that all the things that go into writing (drafting, rewriting, sounding words out to make sure they articulate what you mean) are pedagogical acts in themselves. Assignments aren't supposed to be busy work; they're additional opportunities for learning in which students have to evaluate their own writing strategies. Using AI tools removes that element of metacognition and reduces assignments to just prompt tuning. If you're just filling out reports and are suffering writer's block, sure, why not. But in other cases the writing process is the lesson.

1

Acceptable-Cress-374 t1_j6haw2s wrote

I tested this with text-davinci-003.

Prompt: Write a post about liking fruity icecream, in the style of a reddit post. Make some minor grammar mistakes while writing the post.

> hey everyone, so I'm totally loving this fruity icecream lately! I got it from a store near my house and its been so good. Its really sweet but not overly sickeningly sweet, like some other ice cream. Plus the fact that its fruity makes it even better in my opinion. Does anyone else like fruity icecream too? It'd be awesome to hear some of your thoughts on it!

This site gave me this:

85% HUMAN-GENERATED CONTENT

3

yahma t1_j6evm31 wrote

The problem with many of these AI content detectors is that they too often flag human-written text as AI-generated.
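
To put a number on that, the false positive rate is FP / (FP + TN): of all texts that are truly human-written, the share the detector labels as AI. A small sketch, assuming labels are the strings "human" and "ai" (an illustrative convention, not any particular tool's output format):

```python
from typing import Sequence


def false_positive_rate(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "ai",  # assumed label for "flagged as AI-generated"
) -> float:
    """Share of truly human texts that were flagged as AI: FP / (FP + TN)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Keep only predictions for texts whose true label is not the positive class.
    preds_on_human = [p for t, p in zip(y_true, y_pred) if t != positive]
    if not preds_on_human:
        raise ValueError("no human-written samples to evaluate")
    false_positives = sum(1 for p in preds_on_human if p == positive)
    return false_positives / len(preds_on_human)


# Made-up labels purely to show the calculation; not real results.
truth = ["human", "human", "human", "human", "ai", "ai"]
preds = ["human", "ai", "human", "human", "ai", "human"]
print(false_positive_rate(truth, preds))  # 1 of 4 human texts flagged -> 0.25
```

The estimate is only as good as the human-written test set, so it should cover the kinds of writers the detector will actually be used on.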

1

Dry-Tomatillo449 t1_j6htvqh wrote

An AI Content Detector is a type of artificial intelligence software that is used to detect and analyze content from various sources such as images, audio, or video. It can be used to identify objects in images, detect text in audio and video recordings, or find relevant topics in documents. AI Content Detectors can be used to automate tasks such as content curation, content filtering, and content recommendation. It can be used to make decisions about what content should be included in a website, blog, or other online material. Additionally, AI Content Detectors can be used to identify and classify images, audio, video, and text in order to better understand the content and provide more relevant results.

1