



A pitfall for machine learning methods aiming to predict across cell types.

Jacob Schreiber1, Ritambhara Singh2,3, Jeffrey Bilmes1,4

  • 1Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, USA.

Genome Biology | November 20, 2020
Summary
This summary is machine-generated.

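Machine learning models that predict genomic activity can appear accurate when they have merely memorized the average activity at each genomic locus rather than learned to generalize across cell types. This study identifies the pitfall and proposes ways to diagnose and avoid it, so that cross-cell-type predictions can be evaluated reliably.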

Keywords:
Epigenomics, Genomics, Machine learning


Area of Science:

  • Genomics
  • Bioinformatics
  • Machine Learning

Background:

  • Machine learning models are crucial for predicting genomic activity.
  • Accurate predictions across diverse cell types are essential for model utility.
  • Overfitting can lead to misleading performance metrics in genomic prediction models.

Purpose of the Study:

  • To identify and explain a common pitfall in machine learning models for genomic activity prediction.
  • To demonstrate how models can falsely appear accurate due to memorizing genomic loci.
  • To propose methods for diagnosing and avoiding this overfitting issue.

Main Methods:

  • Training and testing machine learning models on genomic data.
  • Analyzing model performance when the training and test sets cover the same genomic loci in different cell types.
  • Evaluating predictions of gene expression and chromatin domain boundaries.

Main Results:

  • Models trained and tested on the same genomic loci can exhibit inflated performance metrics.
  • This phenomenon arises from the model memorizing locus-specific average activity.
  • The issue was observed in both gene expression and chromatin boundary prediction tasks.
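
The memorization effect described above can be illustrated with a small simulation. The sketch below is not the authors' code or data: the synthetic activity values, the function names (`simulate_activity`, `average_activity_baseline`), and the noise levels are all assumptions chosen for illustration. It shows that a "model" which simply averages the training cell types at each locus, and knows nothing about the held-out cell type, can still correlate strongly with that cell type's measured activity.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of floats."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_activity(n_loci, n_cell_types, seed=0):
    """Synthetic data (an assumption, not real measurements): each locus has
    its own mean activity, and each cell type adds independent noise."""
    rng = random.Random(seed)
    locus_means = [rng.gauss(0.0, 2.0) for _ in range(n_loci)]
    return [
        [mean + rng.gauss(0.0, 1.0) for mean in locus_means]
        for _ in range(n_cell_types)
    ]


def average_activity_baseline(train_profiles):
    """Predict each locus as its mean activity across the training cell types."""
    n_train = len(train_profiles)
    return [sum(values) / n_train for values in zip(*train_profiles)]


profiles = simulate_activity(n_loci=2000, n_cell_types=10)
train_profiles, held_out_profile = profiles[:-1], profiles[-1]

baseline = average_activity_baseline(train_profiles)
baseline_r = pearson(baseline, held_out_profile)
print(f"average-activity baseline vs. held-out cell type: r = {baseline_r:.2f}")
```

Because the baseline uses no information about the held-out cell type, any model evaluated on the same loci would need to clearly exceed this correlation before its score says anything about cross-cell-type generalization.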

Conclusions:

  • Over-reliance on identical genomic loci in training and testing sets can create a false impression of model generalizability.
  • Awareness and implementation of diagnostic methods are crucial for developing robust genomic prediction models.
  • Future large-scale genomic projects must proactively address this memorization pitfall to ensure reliable predictions across cell types.
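
One way such a diagnostic might look is sketched below. This is an assumed procedure for illustration, not necessarily the one the study proposes: the synthetic data, the hypothetical `memorizer` model, and the idea of scoring residuals after subtracting the training-set average are all assumptions. The sketch compares a model's raw correlation on a held-out cell type with its correlation once the locus-specific average activity has been removed from both predictions and measurements.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of floats."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)
n_loci, n_train = 2000, 9

# Synthetic activity (an assumption): per-locus mean plus cell-type noise.
locus_means = [rng.gauss(0.0, 2.0) for _ in range(n_loci)]
train_profiles = [
    [mean + rng.gauss(0.0, 1.0) for mean in locus_means] for _ in range(n_train)
]
held_out_profile = [mean + rng.gauss(0.0, 1.0) for mean in locus_means]

# Locus-specific average activity across the training cell types.
baseline = [sum(values) / n_train for values in zip(*train_profiles)]

# Hypothetical model that has only memorized the per-locus average;
# the small jitter stands in for ordinary model noise.
memorizer = [b + rng.gauss(0.0, 0.1) for b in baseline]

# Raw score: looks strong because the loci are shared with training.
raw_r = pearson(memorizer, held_out_profile)

# Residual score: subtract the baseline from predictions and measurements,
# so only cell-type-specific deviations remain to be explained.
predicted_residual = [p - b for p, b in zip(memorizer, baseline)]
observed_residual = [y - b for y, b in zip(held_out_profile, baseline)]
residual_r = pearson(predicted_residual, observed_residual)

print(f"raw correlation:      {raw_r:.2f}")
print(f"residual correlation: {residual_r:.2f}")
```

In this toy setting, a large gap between the raw and residual scores indicates that the apparent accuracy comes from locus-level averages rather than from anything learned about the held-out cell type.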