Reliability-enhanced data cleaning in biomedical machine learning using inductive conformal prediction.

Xianghao Zhan1,2, Qinmei Xu2, Yuanning Zheng2

  • 1Department of Bioengineering, Stanford University, Stanford, California, United States of America.

PLOS Computational Biology | February 13, 2025
Summary
This summary is machine-generated.

This study introduces a method that uses inductive conformal prediction (ICP) to clean noisy biomedical training datasets and thereby improve machine learning model performance. The reliability-based approach identifies and corrects mislabeled data, improving accuracy in applications such as drug-induced liver injury (DILI) literature filtering and disease prediction.

Area of Science:

  • Biomedical Machine Learning
  • Data Science
  • Computational Biology

Background:

  • Accurate labeling of large biomedical datasets is crucial for machine learning but difficult to achieve, and data augmentation can introduce further noise into the training data.
  • Existing methods for handling noisy data often require strict assumptions and well-curated datasets.
  • This limitation hinders the development of robust machine learning models in healthcare.

Purpose of the Study:

  • To develop a novel, reliability-based training data cleaning method using inductive conformal prediction (ICP).
  • To address the challenge of noisy labels in large biomedical datasets without strict modeling assumptions.
  • To improve the performance of machine learning models in various biomedical classification tasks.

Main Methods:

  • Proposed a novel reliability-based training data cleaning method employing inductive conformal prediction (ICP).
  • Leveraged ICP-calculated reliability metrics to identify and correct mislabeled data and outliers in noisy datasets.
  • Utilized a small set of well-curated data alongside vast quantities of noisy data.
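The workflow in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: the centroid-distance nonconformity score, the `eps` threshold, the keep/relabel/drop rule, and all function names and toy data are assumptions chosen for brevity. The sketch fits class centroids on a small curated training split, scores a curated calibration split, and then uses ICP p-values as the reliability measure for each noisy sample.

```python
from math import sqrt

def euclid(a, b):
    """Euclidean distance between two equal-length points."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def centroids(X, y):
    """Mean feature vector per class, fitted on the curated training split."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def nonconformity(x, label, cents):
    """Assumed score: distance to the centroid of the hypothesised class."""
    return euclid(x, cents[label])

def p_values(x, cents, cal_scores):
    """ICP p-value of every candidate label for x (non-label-conditional form)."""
    n = len(cal_scores)
    out = {}
    for label in cents:
        a = nonconformity(x, label, cents)
        out[label] = (sum(s >= a for s in cal_scores) + 1) / (n + 1)
    return out

def clean(X_noisy, y_noisy, cents, cal_scores, eps=0.2):
    """Hypothetical rule: keep a plausible label, else relabel, else drop."""
    result = []
    for x, label in zip(X_noisy, y_noisy):
        p = p_values(x, cents, cal_scores)
        best = max(p, key=p.get)
        if p[label] > eps:
            result.append((x, label, "kept"))
        elif p[best] > eps:
            result.append((x, best, "relabeled"))
        # otherwise no label is plausible: treat as an outlier and drop it
    return result

# Small curated data, split into a training part and a calibration part.
train_X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
train_y = ["neg", "neg", "pos", "pos"]
cal_X = [(0.1, 0.2), (0.0, 0.1), (0.3, 0.0), (1.1, 0.9), (0.8, 1.0), (1.0, 1.2)]
cal_y = ["neg", "neg", "neg", "pos", "pos", "pos"]

cents = centroids(train_X, train_y)
cal_scores = [nonconformity(x, y, cents) for x, y in zip(cal_X, cal_y)]

# Noisy data: one correct label, one flipped label, one outlier.
noisy_X = [(0.1, 0.1), (0.95, 1.0), (5.0, 5.0)]
noisy_y = ["neg", "neg", "pos"]
cleaned = clean(noisy_X, noisy_y, cents, cal_scores)
```

In this toy run the first sample keeps its label, the second is relabeled from "neg" to "pos", and the third is dropped because no label receives a p-value above `eps`. A real pipeline would replace the centroid score with the nonconformity measure of the chosen classifier and tune the threshold on held-out data.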

Main Results:

  • Significantly enhanced downstream classification performance across three distinct biomedical tasks: DILI literature filtering, COVID-19 patient ICU admission prediction, and breast cancer subtyping.
  • Achieved substantial accuracy improvements in DILI experiments (up to 11.4%) and RNA-sequencing experiments (up to 74.6%).
  • Improved AUROC and AUPRC for COVID-19 ICU admission prediction by up to 23.8% and 69.8%, respectively, and improved both accuracy and F1-score on the RNA-sequencing data.
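For readers unfamiliar with the two ranking metrics named above, the following sketch shows one common way to compute them. It is illustrative only: the function names and the toy labels and scores are assumptions, and the AUPRC is computed as average precision, which is one of several conventions for the area under the precision-recall curve.

```python
def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive (assumes no tied scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy example: three positives and three negatives.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc = auroc(y_true, scores)             # 7 of 9 positive/negative pairs ordered correctly
ap = average_precision(y_true, scores)  # (1/1 + 2/3 + 3/4) / 3
```

Comparing these values for a model trained on the original noisy data and one trained on the cleaned data is the kind of before/after comparison the reported improvements refer to.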

Conclusions:

  • The proposed ICP-based data cleaning method effectively improves classification performance in biomedical machine learning tasks.
  • This approach reduces the need for extensive well-curated data and avoids strong distributional or modeling assumptions.
  • The method offers statistically and clinically significant improvements for information retrieval, disease diagnosis, and prognosis.