AI writing detectors have mistakenly flagged the US Constitution, drafted in 1787 long before the advent of the internet, as a document created by artificial intelligence (AI). This happens because the Constitution appears over and over in the training data of large language models. Edward Tian, creator of the AI writing detector GPTZero, explains that these models generate text resembling the texts they saw most frequently in training, including the Constitution. The inability of detection tools to reliably distinguish computer-generated from human-written text raises concerns about false accusations of digital plagiarism: professors at Texas A&M and other universities have failed students after language models falsely claimed to have written their assignments.
GPTZero, which analyzes both human-written and AI-generated text, relies on indicators like "perplexity" and "burstiness" to detect a human touch in writing. Perplexity measures how surprising a passage is to a language model given the text it has seen before; highly predictable, low-perplexity prose reads as machine-like. Burstiness gauges how much sentence structure and word choice vary across a sample; human writing tends to vary more. However, recent research from the University of Maryland questions how reliable these methods are in practice.
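The two metrics can be illustrated with a toy sketch. This is not GPTZero's actual implementation: it stands in a simple character-frequency model for a real language model, scores perplexity as the exponentiated average negative log-probability, and treats burstiness as the spread of per-sentence perplexity scores. All function names here are illustrative.

```python
import math
from collections import Counter

def train_char_model(corpus):
    """Toy stand-in for a language model: unigram character frequencies."""
    counts = Counter(corpus)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def perplexity(text, probs):
    """exp of the average negative log-probability per character.
    Unseen characters get a tiny floor probability instead of zero."""
    logps = [math.log(probs.get(ch, 1e-6)) for ch in text]
    return math.exp(-sum(logps) / len(logps))

def burstiness(sentences, probs):
    """Rough proxy: standard deviation of per-sentence perplexity.
    Uniform scores read as machine-like; varied scores as human-like."""
    scores = [perplexity(s, probs) for s in sentences]
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores))

corpus = "the quick brown fox jumps over the lazy dog " * 50
model = train_char_model(corpus)
# Familiar text scores lower perplexity than rare-character text.
print(perplexity("the quick", model) < perplexity("zzqqxx", model))
```

A real detector would use token-level probabilities from a trained neural language model rather than character frequencies, but the scoring logic follows the same shape.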
Wharton professor Ethan Mollick advocates for embracing AI in education and acknowledges the difficulty in reliably detecting AI-generated writing. He states that existing tools are trained on outdated models and have high false positive rates. Mollick suggests adapting to AI involvement in education rather than attempting to detect it.
Detector creators like Tian have taken note of these challenges. GPTZero is being retooled to focus less on flagging AI-generated text and more on highlighting what is most human in a piece of writing. The aim is to help teachers and students navigate AI's role in education rather than to catch students engaged in plagiarism. The issue of the Constitution being flagged as AI-generated has since been resolved in GPTZero.