



AI models collapse when trained on recursively generated data.

Ilia Shumailov¹, Zakhar Shumaylov², Yiren Zhao³

  • ¹ OATML, Department of Computer Science, University of Oxford, Oxford, UK. ilia.shumailov@chch.ox.ac.uk.

Nature | July 24, 2024
Summary
This summary is machine-generated.

Generative artificial intelligence (AI) models trained on their own output, or on the output of earlier models, can develop irreversible defects, a phenomenon called model collapse. Collapse reduces the quality and diversity of the content that later generations of models produce.


Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Generative Models

Background:

  • Generative artificial intelligence (AI), including large language models (LLMs) like GPT-4 and image generation models like Stable Diffusion, is rapidly transforming online content.
  • The widespread use of AI-generated text and images raises questions about the future of data used for training these models.
  • Successive models such as GPT-2, GPT-3, GPT-3.5, and GPT-4 have performed strongly across a variety of language tasks.

Purpose of the Study:

  • To investigate what happens to future models once LLM-generated content makes up a growing share of the data available for training.
  • To identify and analyze the defects that arise when AI models are trained on model-generated content.
  • To understand the implications of these defects for the sustainability of AI development and the value of diverse data sources.

Main Methods:

  • Theoretical analysis of generative models, including LLMs, variational autoencoders (VAEs), and Gaussian mixture models (GMMs).
  • Simulation and empirical studies to demonstrate the occurrence and effects of model collapse.
  • Investigation into the loss of data distribution tails when models are trained on synthetic data.
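
The recursive-training setup described above can be illustrated with a toy simulation. The sketch below is not the authors' experimental code: it assumes a single Gaussian as a simple stand-in for the generative model, and the function name `recursive_gaussian_fit` and its parameters (`n_samples`, `generations`, `seed`) are hypothetical. Each generation is fitted only to a finite sample drawn from the previous generation's fit.

```python
import random
import statistics


def recursive_gaussian_fit(n_samples=20, generations=200, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Toy stand-in for training each model generation on the output of the
    previous one. Returns a list of (mean, std) pairs, one per generation,
    starting with the original distribution.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the original data distribution
    history = [(mu, sigma)]
    for _ in range(generations):
        # The only training data is a finite sample from the current model.
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # "Train" the next model: maximum-likelihood mean and std
        # (pstdev divides by n, which is the Gaussian MLE).
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append((mu, sigma))
    return history


if __name__ == "__main__":
    for generation, (mu, sigma) in enumerate(recursive_gaussian_fit()):
        if generation % 50 == 0:
            print(f"generation {generation}: mean={mu:.4f}, std={sigma:.6f}")
```

With a small `n_samples`, the fitted standard deviation tends to drift toward zero over many generations, which is the loss of distribution tails in its simplest form. Larger samples slow the drift but, in this toy setting, do not remove it.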

Main Results:

  • Indiscriminate use of model-generated content in training leads to irreversible defects in AI models.
  • The phenomenon, termed 'model collapse,' causes the disappearance of the tails of the original data distribution.
  • Model collapse is shown to be a ubiquitous issue across various types of generative models, including LLMs, VAEs, and GMMs.
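
Why the tail loss is irreversible can be seen in a minimal discrete example. The sketch below is an illustration under stated assumptions, not the paper's method: the "model" is just the empirical frequency table of a categorical distribution, and the function name `resample_generations`, the token labels, and the Zipf-like starting weights are all hypothetical. Once a rare token receives zero count in one generation, its probability is zero in every later generation, so it can never be sampled again.

```python
import random
from collections import Counter


def resample_generations(probs, n_samples=200, generations=50, seed=0):
    """Re-estimate a categorical distribution from its own samples, repeatedly.

    Returns the final probability table and the number of tokens with
    non-zero probability after each generation (index 0 is the original).
    """
    rng = random.Random(seed)
    tokens = list(probs)
    support_sizes = [sum(p > 0 for p in probs.values())]
    for _ in range(generations):
        weights = [probs[t] for t in tokens]
        sample = rng.choices(tokens, weights=weights, k=n_samples)
        counts = Counter(sample)
        # Next "model": empirical frequencies of the sampled data.
        # A token with zero count gets probability 0 from here on.
        probs = {t: counts[t] / n_samples for t in tokens}
        support_sizes.append(sum(p > 0 for p in probs.values()))
    return probs, support_sizes


if __name__ == "__main__":
    # Zipf-like start: a few common tokens and a long tail of rare ones.
    raw = {f"tok{i}": 1.0 / (i + 1) for i in range(30)}
    total = sum(raw.values())
    start = {t: w / total for t, w in raw.items()}
    _, sizes = resample_generations(start)
    print("tokens with non-zero probability per generation:", sizes)
```

The support size never increases from one generation to the next, because a token with probability zero cannot reappear in the sample. Mixing original data back into each generation's training set would break this absorbing behavior, which is consistent with the conclusion that genuine human data becomes more valuable.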

Conclusions:

  • Model collapse poses a significant threat to the long-term quality and diversity of AI-generated content.
  • Sustaining the benefits of training on large-scale web data requires addressing model collapse.
  • Data reflecting genuine human interactions will become increasingly valuable as a countermeasure to model collapse in AI training.