OpenAI has created a classifier to distinguish between text written by humans and text generated by various AIs. According to their report, the classifier is not fully reliable. In their evaluations on a “difficult set” of English texts, it correctly labels 26% of AI-written text as “potentially AI-written” (true positives) while incorrectly flagging 9% of human-written text as AI-written (false positives). The classifier’s reliability typically improves as the input text gets longer. Compared to the previously released classifier, this new one is significantly more reliable on text from more recent AI systems.
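As a rough illustration of what those two rates imply in practice, the sketch below computes how many documents would be flagged in a mixed corpus. The 10% AI share is an assumed base rate for illustration only, not a figure from the report; only the 26% true-positive and 9% false-positive rates come from the text above.

```python
# Hypothetical sketch of the reported rates applied to a mixed corpus.
# tpr=0.26 and fpr=0.09 are the figures from the report; the AI share
# of the corpus is an assumption chosen for illustration.
def flagged_counts(n_docs, ai_share, tpr=0.26, fpr=0.09):
    n_ai = n_docs * ai_share
    n_human = n_docs - n_ai
    true_flags = n_ai * tpr        # AI texts correctly flagged
    false_flags = n_human * fpr    # human texts wrongly flagged
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision

tp, fp, prec = flagged_counts(10_000, ai_share=0.10)
print(f"flagged AI: {tp:.0f}, flagged human: {fp:.0f}, precision: {prec:.3f}")
```

Under these assumptions, the wrongly flagged human texts (810) would outnumber the correctly flagged AI texts (260), which is why the report stresses that flags should be treated as signals rather than verdicts.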
To gather feedback on how useful this classifier is, the researchers are also making it available to the general public. They plan to share improved methods as they continue working on detecting AI-generated text.
Limitations
The classifier has a number of important limitations. It should be used alongside other methods of determining a document’s origin, not as the sole basis for decisions.
The classifier is unreliable on short texts (below 1,000 characters), and it occasionally misclassifies even longer texts.
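Given that unreliability below 1,000 characters, a downstream tool might decline to report a verdict for short inputs at all. The sketch below is a minimal illustration of that gating, assuming a hypothetical `classify` stand-in; it is not OpenAI's API.

```python
# Minimal sketch: refuse to report a verdict on inputs shorter than the
# 1,000-character threshold mentioned in the report. `classify` is a
# hypothetical placeholder, not a real detector.
MIN_CHARS = 1_000

def classify(text: str) -> str:
    # Placeholder; a real system would call an actual detection model.
    return "possibly AI-written"

def detect_with_guard(text: str) -> str:
    if len(text) < MIN_CHARS:
        return "inconclusive: text too short for a reliable verdict"
    return classify(text)

print(detect_with_guard("short snippet"))
```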
On occasion, the classifier will confidently but incorrectly classify human-written content as AI-written.
The classifier should only be used for English text, according to the researchers. It cannot be trusted with code and performs far worse in other languages.
Highly predictable text is difficult to classify. For instance, it is impossible to tell whether a list of the first 1,000 prime numbers was written by a human or an AI, because the correct output is always the same.
AI-generated text can be edited to evade the classifier. The classifier can be updated and retrained in response to successful attacks, but it is currently unclear whether detection holds a long-term advantage.
Neural network-based classifiers are known to be poorly calibrated outside their training data. On inputs that differ considerably from the text in its training set, the classifier is sometimes highly confident in an incorrect prediction.
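One common way to soften that overconfidence downstream is to map the detector's raw probability to a hedged label rather than a hard binary verdict. The sketch below illustrates the idea; the thresholds and label names are illustrative assumptions, not OpenAI's.

```python
# Hypothetical sketch: bucket a detector's raw probability into hedged
# labels so an overconfident score on out-of-distribution input is not
# reported as a certainty. Thresholds are illustrative assumptions.
def bucket(p_ai: float) -> str:
    if p_ai >= 0.98:
        return "likely AI-generated"
    if p_ai >= 0.90:
        return "possibly AI-generated"
    if p_ai >= 0.45:
        return "unclear"
    if p_ai >= 0.10:
        return "unlikely AI-generated"
    return "very unlikely AI-generated"

print(bucket(0.99), "|", bucket(0.50), "|", bucket(0.05))
```

Even with such buckets, a miscalibrated model can still land confidently in the wrong bucket, which is why the report recommends combining the classifier with other evidence.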
Conclusion
The researchers acknowledge that detecting AI-generated text has been an important point of discussion among academics. It is equally important to understand the limitations and implications of AI-generated text classifiers in the classroom. They have developed a preliminary resource on using ChatGPT for educators that outlines some of its benefits, restrictions, and open problems. Beyond educators, the researchers believe their classifier and related tools will affect journalists, misinformation researchers, and other organisations.
The researchers are engaging with American teachers to learn what they are seeing in their classrooms and to discuss the advantages and disadvantages of ChatGPT. They will keep expanding their outreach as they learn more. These discussions are essential to their goal of deploying large language models safely, in close contact with the communities they affect.