Online content farms have long been a source of controversy in the media industry. Recent reports, however, shed light on a new trend that has rocked the journalism world: AI chatbots being used to plagiarize news articles from reputable sources like The New York Times and republish them on various websites. These sites, numbering at least 37 according to NewsGuard, have taken content replication to another level by scrambling and rewriting entire news stories without proper credit.
What sets these AI-powered content farms apart from their predecessors is the use of chatbots, allowing them to create articles that are nearly indistinguishable from authentic ones. In the past, content mills would merely instruct AI models to generate stories from scratch, resulting in poor-quality writing. But with chatbots relying on existing articles as source material, the quality of the plagiarized content has significantly improved.
NewsGuard, a misinformation monitor, identified these content farms by spotting telltale chatbot error messages left behind in their published articles. However, it acknowledges that the actual number of sites engaging in the practice could be much higher: many fly under the radar by promptly scrubbing those error messages, making the plagiarism far harder to detect.
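NewsGuard has not published its tooling, but the detection idea is simple enough to sketch. The Python snippet below is a minimal illustration of that kind of check, not NewsGuard's actual method: it fetches a page and scans the visible text for boilerplate chatbot refusal phrases. The URL and the phrase list are hypothetical stand-ins.

```python
# Minimal sketch: scan a page's visible text for chatbot boilerplate.
# The phrase list and URL are illustrative assumptions, not NewsGuard's data.
import urllib.request
from html.parser import HTMLParser

# Telltale strings chatbots emit when they refuse or fail a request
# (hypothetical examples).
CHATBOT_TELLS = [
    "as an ai language model",
    "i cannot complete this prompt",
    "i cannot rewrite this article",
    "my cutoff date",
]

class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style blocks."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def find_chatbot_tells(url):
    """Return any telltale phrases found in the page's visible text."""
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(parser.chunks).lower()
    return [phrase for phrase in CHATBOT_TELLS if phrase in text]

if __name__ == "__main__":
    # Hypothetical target; any article URL could be checked the same way.
    hits = find_chatbot_tells("https://example.com/some-article")
    print("Possible chatbot residue:", hits or "none found")
```

Of course, as noted above, a site that scrubs these messages before publishing would evade exactly this kind of string matching, which is why the true count of offenders is likely higher than 37.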
These content farms cover a wide range of topics, including science, politics, sports, and breaking news. Among the identified websites are DailyHeadliner.com and TalkGlitz.com. Ironically, one site, WhatsNew2Day.com, even wrote an article about AI based on a story from The Verge about ads running against AI-generated news.
Aside from the ethical concerns surrounding plagiarism, advertisers unknowingly contribute to the financial success of these AI copycat sites. NewsGuard found programmatic ads from 55 well-known brands running on 15 of the analyzed sites. Due to the opaque nature of programmatic advertising, these brands remain unaware that their ads are funding AI plagiarism.
While it remains unclear which specific AI models are being used, both Google and OpenAI have policies against plagiarism and the misrepresentation of generated content. Unfortunately, those policies currently appear to go unenforced, and neither company has responded to inquiries on the issue.
The rise of AI content farms has opened up a larger conversation about the future of the news industry. Traditional publishers face the challenge of incorporating AI into their newsrooms while maintaining journalistic integrity. Some publications have already been caught using AI to generate articles without proper disclosure, while others are exploring AI tools for ideation and interview preparation. Nevertheless, skepticism remains, as AI models are prone to hallucinating facts and are often trained on copyrighted material, both of which threaten ethical journalism.
As the industry grapples with these ethical and legal concerns, it comes as no surprise that The New York Times has taken a stand against AI companies by explicitly forbidding them from using its archives to train AI systems. The Times may even take legal action against OpenAI over data scraping. The battle between AI and ethical journalism rages on, leaving both publishers and the public questioning the role of AI in news production and consumption.