



Prior Adaptive Semi-supervised Learning with Application to EHR Phenotyping.

Yichi Zhang¹, Molei Liu², Matey Neykov³

  • ¹ Department of Computer Science and Statistics, University of Rhode Island.

Journal of Machine Learning Research (JMLR) | November 17, 2023
Summary
This summary is machine-generated.

Electronic Health Record (EHR) data can advance disease research but lack precise phenotype information. A new semi-supervised method improves EHR phenotyping by using both labeled and weakly labeled data, enhancing discovery research.

Keywords:
High dimensional sparse regression; electronic health records; regularization; semi-supervised learning; single index model


Area of Science:

  • Biomedical Informatics
  • Computational Biology
  • Health Services Research

Background:

  • Electronic Health Records (EHR) offer rich data for biomedical research but are underutilized due to imprecise phenotype information.
  • Supervised learning methods for EHR phenotyping require large labeled datasets, which are often unavailable.
  • Existing methods struggle with large feature sets and limited gold-standard labeled data.

Purpose of the Study:

  • To develop a novel semi-supervised (SS) EHR phenotyping method to address the limitations of supervised approaches.
  • To improve the accuracy and generalizability of phenotype prediction from EHR data.
  • To leverage both small labeled and large weakly-labeled datasets for enhanced phenotyping.

Main Methods:

  • Proposed a semi-supervised (SS) EHR phenotyping approach utilizing a small labeled dataset and a large weakly-labeled dataset.
  • Introduced a prior adaptive semi-supervised (PASS) estimator that incorporates prior knowledge by shrinking the estimate toward a direction derived from that prior knowledge.
  • Derived asymptotic theory to justify the estimator's efficiency and robustness, even with imperfect prior information.
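
The idea of shrinking toward a prior direction can be illustrated with a minimal sketch. This is not the PASS estimator from the paper: it assumes a continuous outcome, a linear model, and a ridge-type penalty chosen only because it has a closed form. The function names `prior_direction` and `shrink_toward_direction`, the simulated data, and the tuning value `lam` are all hypothetical. The direction is estimated from a large weakly-labeled set by regressing a noisy surrogate on the features; the small labeled set is then fitted with a penalty on the part of the coefficient vector that is orthogonal to that direction.

```python
import numpy as np


def prior_direction(X_weak, surrogate):
    """Unit-norm direction from least squares of a noisy surrogate on features.

    Assumption for illustration: the surrogate is a continuous weak label.
    """
    coef, *_ = np.linalg.lstsq(X_weak, surrogate, rcond=None)
    return coef / np.linalg.norm(coef)


def shrink_toward_direction(X, y, direction, lam):
    """Minimize ||y - X b||^2 + lam * min_rho ||b - rho * d||^2 over b.

    For a unit vector d the inner minimum equals b' P b with P = I - d d',
    so the solution solves (X'X + lam * P) b = X'y.
    lam = 0 gives ordinary least squares; a large lam forces b onto the
    line spanned by d while the labeled data still choose its scale.
    """
    d = direction / np.linalg.norm(direction)
    P = np.eye(X.shape[1]) - np.outer(d, d)  # projector orthogonal to d
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
p = 20
beta_true = np.zeros(p)
beta_true[:3] = [1.0, -1.0, 0.5]

# Large weakly-labeled set: features plus a noisy surrogate of the outcome.
X_weak = rng.normal(size=(5000, p))
surrogate = X_weak @ beta_true + rng.normal(scale=2.0, size=5000)

# Small labeled set: the true signal has the same direction, different scale.
X_lab = rng.normal(size=(40, p))
y_lab = X_lab @ (2.0 * beta_true) + rng.normal(size=40)

d_hat = prior_direction(X_weak, surrogate)
beta_ols = shrink_toward_direction(X_lab, y_lab, d_hat, lam=0.0)
beta_shrunk = shrink_toward_direction(X_lab, y_lab, d_hat, lam=1e6)
```

In this sketch `lam` controls how strongly the labeled-data fit trusts the prior direction. A data-driven choice (for example cross-validation on the labeled set) would let the fit fall back toward the unpenalized estimate when the prior direction is poor, which is the kind of robustness the summary attributes to PASS.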

Main Results:

  • The proposed PASS estimator outperformed existing methods in simulation studies.
  • The method proved effective and robust across various scenarios, including those with poor quality prior information.
  • Validation on three real-world EHR phenotyping studies at a major tertiary hospital confirmed its practical utility.

Conclusions:

  • The developed semi-supervised (SS) EHR phenotyping method, particularly the PASS estimator, improves phenotype prediction accuracy.
  • This approach effectively overcomes the limitations of small labeled datasets in EHR research.
  • The findings suggest a promising direction for advancing discovery research using Electronic Health Record data.