LACE-UP: An ensemble machine-learning method for health subtype classification on multidimensional binary data.

Rebecca Danning¹, Frank B Hu², Xihong Lin¹,³

  • ¹Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02215.

Proceedings of the National Academy of Sciences of the United States of America | April 23, 2025
Summary
This summary is machine-generated.

A new machine learning method, LACE-UP, effectively identifies disease and behavior subtypes from complex binary data. This approach enhances subtype discovery without needing to pre-set cluster numbers, outperforming existing methods in realistic scenarios.

Keywords:
UMAP, cluster analysis, disease and behavior subtypes, ensemble learning, nonlinear dimensionality reduction

Area of Science:

  • Biomedical research
  • Machine learning
  • Data science

Background:

  • Subtype identification is crucial in biomedical research.
  • Existing clustering methods struggle with multidimensional binary data.
  • Lack of robust statistical methods limits subtype discovery.

Purpose of the Study:

  • Introduce LACE-UP (Latent Class Analysis Ensembled with UMAP and PCA) for robust binary data clustering.
  • Develop a method that does not require prespecifying the number of clusters.
  • Address challenges like correlated and unrelated variables in subtype discovery.

Main Methods:

  • Ensemble machine-learning approach combining Latent Class Analysis (LCA), Principal Components Analysis (PCA), and Uniform Manifold Approximation and Projection (UMAP).
  • LCA provides model-based clustering.
  • PCA offers spectral signal processing, and UMAP provides model-free dimensionality reduction.
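
To make the model-based clustering component more concrete, here is a minimal sketch of latent class analysis on binary data, written as a mixture of independent Bernoulli variables fit by expectation-maximization. It is not the LACE-UP implementation and does not include the PCA or UMAP components or the ensembling step. The function names `fit_lca` and `simulate`, the synthetic item profiles, and the fixed choice of two classes are all assumptions made for illustration. Note that this sketch fixes the number of classes by hand, which LACE-UP is designed to avoid.

```python
import math
import random


def fit_lca(data, n_classes, n_iter=100, seed=0):
    """Fit a latent class model (mixture of independent Bernoullis) by EM.

    data: list of rows, each a list of 0/1 values.
    Returns (class_weights, item_probs, responsibilities).
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    eps = 1e-6  # keeps probabilities away from 0 and 1

    # Random start: uniform class weights, item probabilities near 0.5.
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)]
             for _ in range(n_classes)]

    for _ in range(n_iter):
        # E-step: posterior class membership per row, computed in log space.
        resp = []
        for row in data:
            logp = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for x, p in zip(row, probs[k]):
                    lp += math.log(p) if x else math.log(1.0 - p)
                logp.append(lp)
            m = max(logp)
            unnorm = [math.exp(lp - m) for lp in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])

        # M-step: update class weights and per-class item probabilities.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            weights[k] = max(nk / n, eps)
            for j in range(d):
                num = sum(r[k] * row[j] for r, row in zip(resp, data))
                probs[k][j] = min(max(num / max(nk, eps), eps), 1.0 - eps)

    return weights, probs, resp


def simulate(n_per_class, profiles, seed=1):
    """Draw synthetic binary rows from hypothetical per-class item profiles."""
    rng = random.Random(seed)
    rows, labels = [], []
    for k, profile in enumerate(profiles):
        for _ in range(n_per_class):
            rows.append([1 if rng.random() < p else 0 for p in profile])
            labels.append(k)
    return rows, labels


# Two made-up subtypes that endorse opposite halves of six binary items.
profiles = [[0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]]
data, truth = simulate(100, profiles)
weights, probs, resp = fit_lca(data, n_classes=2)

# Hard assignment: each row goes to its most probable class.
assigned = [max(range(2), key=lambda k: r[k]) for r in resp]
agree = sum(a == t for a, t in zip(assigned, truth)) / len(truth)
accuracy = max(agree, 1.0 - agree)  # class labels are arbitrary, allow a swap
print(f"accuracy up to label swap: {accuracy:.2f}")
```

The returned responsibilities are soft memberships, so a row that fits both profiles poorly shows up as an ambiguous assignment rather than being forced into one subtype. In a real analysis the independence assumption within each class is the part most likely to be strained by correlated variables, which is one of the challenges the summary says LACE-UP addresses.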

Main Results:

  • LACE-UP demonstrates superior performance compared to gold-standard techniques in simulations across various realistic data settings.
  • The method is robust to correlated and extraneous variables.
  • Application to UK Biobank dietary data revealed interpretable dietary subtypes linked to cardiovascular risk.

Conclusions:

  • LACE-UP offers a powerful and robust solution for clustering multidimensional binary data.
  • The method facilitates more accurate and interpretable disease and behavior subtype discovery.
  • This approach has significant implications for understanding health behaviors and associated risks.