The AI Feedback Loop – Should We Be Concerned?
Up until the last few years, most, but not all, of the content found on the internet was the product of direct human creation. Original images were either photographs (digital or scanned from paper prints) or created with digital drawing software. Video footage was usually captured by cameras or other recording devices. Stories, poems, and news articles were credited to creative writers, journalists, and authors. Together, this core body of media made up the basis of what could be found on the world wide web.
As time passed, the world watched as users turned to Photoshop, Corel, and other tools to alter original photos. In the early stages, these changes were good-natured; memes became commonplace as a source of light entertainment. Video deepfakes and other forms of media manipulation soon grew in sophistication, becoming nearly impossible to distinguish from the original.
From Its Early Roots Until Today - The Rise of AI
To some, the rise of AI seemed like a natural evolution of digital tools. However, few realize that AI has existed in some form since the 1950s, long before today’s deep learning models. These early systems were primarily rule-based, mimicking human reasoning in limited, structured domains. Noteworthy programs such as ELIZA (1966), an early natural language processing program, and MYCIN (1972–1977), which was developed to diagnose bacterial infections and recommend antibiotics, relied on manually encoded knowledge to reach their conclusions.
More recently, powerful machine learning and deep learning systems have emerged, such as GPT, Midjourney, AlphaGo, and Microsoft Copilot. These modern AI programs work by recognizing patterns, learning from data, and making decisions or predictions. The premise is that this activity mimics how humans learn from experience. AI systems use algorithms to find patterns in data, which provide the basis for creating something entirely new. For example, AI would discover what a cat looks like by analyzing thousands, perhaps hundreds of thousands, of cat photos found on the internet. Once it had enough data, the model could generate new cat images or identify cats in photos based on what it had learned.
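The pattern-learning idea above can be sketched with a toy example. The code below is a minimal, hypothetical nearest-centroid classifier: it "learns" what a cat looks like only as the average of the labeled examples it is shown. The feature values are invented for illustration; a real vision model learns from millions of pixels, not two hand-picked numbers.

```python
# Toy sketch of pattern learning: a concept is "learned" as the
# average of labeled example feature vectors. The features (e.g.
# ear pointiness, whisker length) are invented for illustration.

def centroid(examples):
    """Average the feature vectors for one label."""
    n = len(examples)
    return [sum(v[i] for v in examples) / n for i in range(len(examples[0]))]

def classify(sample, centroids):
    """Pick the label whose centroid is nearest (squared distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

training = {
    "cat": [[0.9, 0.8], [0.8, 0.9], [0.95, 0.85]],
    "dog": [[0.2, 0.3], [0.3, 0.2], [0.25, 0.35]],
}
centroids = {label: centroid(vecs) for label, vecs in training.items()}

print(classify([0.85, 0.9], centroids))  # prints "cat"
```

The point of the sketch is that the model has no concept of "cat" beyond the statistics of its training examples, which is exactly why the quality of those examples matters.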
AI-Generated Content
The rapid consumption of existing data has given early users of AI a unique opportunity to develop new, albeit AI-generated, content. Observers of this phenomenon have already begun to wonder what happens when AI-generated content becomes more common, possibly to the point that it overtakes original content. Will future AI models be “learning” from data that is itself AI-generated? If so, what can we expect as consumers of that data?
Loss of originality may not seem like a big problem, until it is. If models are trained on their own output or the output of other AI models, they may gradually reinforce existing patterns instead of discovering new ideas. One area that may become dangerously compromised is truth in news reporting, especially political news. AI models may present a narrative so strongly that the truth becomes lost; this happens when fake news dominates the web. AI doesn't truly "know" the truth; it simply reflects patterns from its training data. If factual errors in AI-generated content are recycled into training sets, inaccuracies could compound over time. Our future history might be tainted with propaganda or outright lies.
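The narrowing effect of recycling output back into training can be simulated in a few lines. In this hypothetical sketch, each "generation" of a model trains only on samples drawn from the previous generation's output; because nothing new ever enters the pool, the diversity of the training data (counted here as distinct items) can shrink but never grow.

```python
import random

def next_generation(pool, rng):
    """'Train' on the previous generation by resampling its output
    with replacement -- no new items can ever appear."""
    return [rng.choice(pool) for _ in range(len(pool))]

rng = random.Random(42)   # fixed seed so the run is repeatable
pool = list(range(100))   # generation 0: 100 distinct "human-made" items

diversity = [len(set(pool))]
for _ in range(10):
    pool = next_generation(pool, rng)
    diversity.append(len(set(pool)))

print(diversity)  # distinct items per generation, never increasing
```

Running it shows the count of distinct items falling generation after generation, a crude stand-in for how ideas (or errors) present in early outputs crowd out everything else when models feed on themselves.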
Recycle, Reuse, or Retrain?
Another concern is hitting a creativity plateau. Currently, we still read articles written by real people, with novel ideas, unique perspectives, and different viewpoints. Photographs are influenced by the artistic expression of the photographer. Scientific work comes from real research, theories, and scientists taking chances on abstract ideas or concepts. Human experience plays an enormous role in each of these examples. If AI begins to train on itself, creativity will be lost, possibly sooner than one might expect. AI models will become echo chambers, subject to their own limitations, eventually leading to stagnation in creativity and depth. Everything will seem recycled and lacking in substance.
So, What Are We to Do?
Top AI developers are already addressing this issue in their next-generation programs. Several safeguards are being integrated to ensure the plateau doesn’t happen. One important component is tagging and tracing AI-generated content so it can be filtered out of future training sets. To supplement this process, more human feedback and curated data will be used. Curated data is data that has been carefully selected, cleaned, organized, and verified before use. The belief is that relying on uncorrupted sources will help keep fake news from becoming part of the permanent record.
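The tag-and-filter step can be sketched simply. Assuming each document carries a provenance tag (the field name and tag values below are hypothetical; real systems would rely on watermarking or metadata standards rather than a plain text field), building a curated training set reduces to filtering out anything not verified as human-made:

```python
# Hypothetical provenance tags for illustration only; real pipelines
# would use watermark detection or signed metadata, not a plain field.
documents = [
    {"text": "Field-reported news story", "source": "human-verified"},
    {"text": "Synthetic summary",         "source": "ai-generated"},
    {"text": "Archived academic paper",   "source": "human-verified"},
    {"text": "Unlabeled forum post",      "source": "unknown"},
]

def curate(docs):
    """Keep only documents whose provenance is human-verified."""
    return [d for d in docs if d["source"] == "human-verified"]

training_set = curate(documents)
print(len(training_set))  # prints 2
```

Note that the sketch drops "unknown" content as well as AI-generated content; in practice, deciding what to do with unlabeled material is the hard part of curation.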
Even though we hear tales of the internet being scraped for all its data, there is still plenty of information that AI has not touched. Non-digital sources such as books, interviews, and academic journals will be a key part of enhancing the current collection of working materials. This should also improve source attribution and originality, giving more credit to those who created the work rather than to those who merely commented on or wrote about it.
We Are Our Own Saviors
The goal across the industry is to keep current and new models grounded in human knowledge, even as they become capable of generating vast amounts of content. But if the internet becomes saturated with AI-written material and AI-generated images and videos, trust may erode. Users may lose confidence in what they read and see. Educators will struggle to teach facts to students who challenge them with false or skewed online “facts.” Populations may lose their political voice as ideologies replace common-sense beliefs. Even everyday activities may be deemed unsafe by overly aggressive AI tools.
For now, most of us will have to sit and wait, trusting that the designers have our best interests in mind. We can only pray that AI-generated content is clearly labeled and monitored by trustworthy human beings. Originality must remain the cornerstone of the world wide web; without it, corrupt forces can shift the narrative to whatever they wish it to be. AI has immense potential, but great power must be handled with great responsibility. The focus must remain on amplifying human knowledge and insight, not replacing it. Preventing a creative or factual plateau will require a conscious, collaborative effort from developers, researchers, users, and content creators alike. If not, the world as we know it just might be rewritten, and our source of knowledge forever limited by our own hand.
This content is accurate and true to the best of the author’s knowledge and is not meant to substitute for formal and individualized advice from a qualified professional.