Data leakage inflates prediction performance in connectome-based machine learning models.

Matthew Rosenblatt¹, Link Tejavibulya², Rongtao Jiang³

  • ¹ Department of Biomedical Engineering, Yale University, New Haven, CT, USA. matthew.rosenblatt@yale.edu.

Nature Communications | February 28, 2024
Summary
This summary is machine-generated.

Data leakage in neuroimaging predictive models, especially via feature selection and repeated subjects, inflates performance. Avoiding leakage is crucial for valid and reproducible brain-behavior research, particularly with small datasets.

Area of Science:

  • Neuroimaging
  • Machine Learning
  • Computational Neuroscience

Background:

  • Predictive modeling in neuroimaging aims to identify brain-behavior relationships and to test whether they generalize to unseen data.
  • Data leakage, a breach in training-test data separation, compromises model validity and is prevalent in machine learning.
  • Understanding leakage effects is vital for evaluating existing neuroimaging literature.

Purpose of the Study:

  • To investigate the impact of five distinct data leakage types on neuroimaging predictive models.
  • To assess how leakage affects functional and structural connectome-based machine learning.
  • To determine the influence of dataset size on leakage effects.

Main Methods:

  • Examined five forms of data leakage, grouped into three categories: feature selection, covariate correction, and dependence between subjects.
  • Applied machine learning models to functional and structural connectome data across four datasets.
  • Evaluated model performance on three distinct phenotypes.
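
To make feature-selection leakage concrete, here is a minimal sketch in plain Python. It is not the study's pipeline: the data are pure random noise, the signed-sum predictor is a deliberate simplification, and all names (`select_features`, `cv_score`, the sample sizes) are hypothetical choices for illustration. The only difference between the two runs is whether features are chosen using all subjects (leaky) or only the training fold (clean).

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def select_features(X, y, rows, k):
    """Pick the k features most correlated with y, looking only at `rows`."""
    ys = [y[i] for i in rows]
    scored = []
    for j in range(len(X[0])):
        r = pearson([X[i][j] for i in rows], ys)
        scored.append((abs(r), j, 1 if r >= 0 else -1))
    scored.sort(reverse=True)
    return [(j, sign) for _, j, sign in scored[:k]]


def cv_score(X, y, k_features, n_folds, leaky):
    """Correlation between out-of-fold predictions and y under k-fold CV."""
    n = len(y)
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    preds = [0.0] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        # The leak: selection is allowed to see the held-out subjects.
        rows = list(range(n)) if leaky else train
        feats = select_features(X, y, rows, k_features)
        for i in test:
            preds[i] = sum(sign * X[i][j] for j, sign in feats)
    return pearson(preds, y)


random.seed(0)
n_subjects, n_features = 40, 300  # assumed toy sizes, not from the study
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_subjects)]
y = [random.gauss(0, 1) for _ in range(n_subjects)]  # unrelated to X by design

leaky_r = cv_score(X, y, k_features=10, n_folds=5, leaky=True)
clean_r = cv_score(X, y, k_features=10, n_folds=5, leaky=False)
print(f"leaky r = {leaky_r:.2f}, clean r = {clean_r:.2f}")
```

Because the features carry no real signal, any apparent predictive performance in the leaky run comes from the selection step having seen the test subjects.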

Main Results:

  • Leakage through feature selection and repeated subjects significantly inflated prediction performance.
  • Other leakage forms demonstrated minimal impact on predictive model accuracy.
  • Smaller datasets amplified the detrimental effects of data leakage.
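
One common way to avoid repeated-subject leakage is to split by subject rather than by scan, so that every scan from a given subject lands on the same side of the split. The sketch below assumes each sample carries a subject identifier; the function name `group_kfold` and the toy IDs are hypothetical, and this is an illustration rather than the procedure used in the study.

```python
from collections import defaultdict


def group_kfold(groups, n_folds):
    """Yield (train_idx, test_idx) pairs in which no group is split across sets."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    # Assign the largest groups first, each to the currently smallest fold.
    folds = [[] for _ in range(n_folds)]
    for g in sorted(members, key=lambda g: -len(members[g])):
        min(folds, key=len).extend(members[g])
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, sorted(test)


# Toy example: ten scans from six subjects, some scanned more than once.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3", "s4", "s5", "s5", "s6"]
splits = list(group_kfold(subject_ids, n_folds=3))
for train, test in splits:
    print("test subjects:", sorted({subject_ids[i] for i in test}))
```

The same grouping idea applies to other dependencies between samples, such as family membership, by passing a family identifier instead of a subject identifier.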

Conclusions:

  • Data leakage has variable effects on neuroimaging predictive models, with some forms being more detrimental than others.
  • Feature selection and subject duplication are critical leakage points to avoid.
  • Ensuring data integrity by preventing leakage is essential for enhancing the validity and reproducibility of neuroimaging research.
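
Covariate correction, one of the leakage categories examined, can be kept leak-free with the same principle: estimate the correction on training subjects only, then apply the fitted coefficients to the held-out subjects. The sketch below assumes a single covariate and a single feature with made-up values; `fit_covariate` and `residualize` are hypothetical helper names, not functions from the study or from any library.

```python
def fit_covariate(feature, covariate):
    """Least-squares slope and intercept for feature ~ covariate."""
    n = len(feature)
    mc = sum(covariate) / n
    mf = sum(feature) / n
    var_c = sum((c - mc) ** 2 for c in covariate)
    slope = sum((c - mc) * (f - mf) for c, f in zip(covariate, feature)) / var_c
    return slope, mf - slope * mc


def residualize(feature, covariate, slope, intercept):
    """Remove the fitted covariate effect from a feature."""
    return [f - (slope * c + intercept) for f, c in zip(feature, covariate)]


# Toy data (assumed): one connectivity edge that depends linearly on age.
age = [20, 30, 40, 50, 60, 70]
edge = [2 * a + 1 for a in age]
train, test = [0, 1, 2, 3], [4, 5]

# Fit on training subjects only; the test subjects never influence the fit.
slope, intercept = fit_covariate([edge[i] for i in train], [age[i] for i in train])
train_resid = residualize([edge[i] for i in train], [age[i] for i in train], slope, intercept)
test_resid = residualize([edge[i] for i in test], [age[i] for i in test], slope, intercept)
print(slope, intercept, test_resid)
```

The leaky alternative would be to fit the slope and intercept on all six subjects before splitting, which lets the held-out data shape the correction.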