What is Model Collapse?

TL;DR

The degradation of AI output quality that occurs when models are repeatedly trained on their own generated data. Related to the so-called '2026 problem.'

Model Collapse: Definition & Explanation

Model Collapse is the phenomenon in which an AI model's output quality and diversity progressively deteriorate when it is trained, generation after generation, on data that AI itself produced (synthetic data). Rare patterns and minority representations in the training distribution are lost with each successive generation, so outputs become increasingly homogeneous and regress toward the average.

The closely related '2026 problem' names a looming supply issue: as the proportion of AI-generated content on the internet surges, the stock of high-quality human-written text needed for LLM training risks drying up. Research organizations such as Epoch AI have projected that high-quality text data for training could be exhausted around 2026.

Countermeasures under investigation include filtering training data to exclude AI-generated content, quality management for synthetic data, human data curation, and federated learning approaches.
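The core mechanism, each generation trains only on the previous generation's outputs, can be illustrated with a toy simulation (a minimal sketch, not a model of any real LLM pipeline): repeatedly fit a Gaussian to samples drawn from the previous fit, and the estimated spread of the distribution shrinks over generations, mirroring the loss of rare patterns and diversity described above.

```python
import random
import statistics

def fit_gaussian(samples):
    # "Train" a model on data: maximum-likelihood fit of mean and stddev.
    return statistics.fmean(samples), statistics.pstdev(samples)

def simulate_collapse(generations=200, n=50, seed=0):
    """Track the fitted stddev as each generation learns only from
    the previous generation's synthetic samples."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the "real" human data distribution
    history = []
    for _ in range(generations):
        # Generate synthetic data from the current model...
        samples = [rng.gauss(mu, sigma) for _ in range(n)]
        # ...then train the next generation on it alone.
        mu, sigma = fit_gaussian(samples)
        history.append(sigma)
    return history

history = simulate_collapse()
print(f"stddev after generation 1:   {history[0]:.3f}")
print(f"stddev after generation 200: {history[-1]:.3f}")
```

Each fit slightly underestimates the true spread from a finite sample, and because later generations never see the original data, those small losses compound instead of averaging out; this is why the countermeasures above all center on keeping genuine human data in the loop.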
