Reliability-enhanced data cleaning in biomedical machine learning using inductive conformal prediction.

Xianghao Zhan¹,², Qinmei Xu², Yuanning Zheng²

¹Department of Bioengineering, Stanford University, Stanford, California, United States of America.

PLOS Computational Biology

February 13, 2025

Summary

This summary is machine-generated.

This study introduces a method that uses inductive conformal prediction (ICP) to clean noisy biomedical datasets and thereby improve machine learning model performance. The reliability-based approach corrects mislabeled data and improves accuracy in applications such as drug-induced liver injury (DILI) literature filtering and disease prediction.

Area of Science:

Biomedical Machine Learning
Data Science
Computational Biology

Background:

Accurate labeling of large biomedical datasets is crucial for machine learning but difficult to achieve, and data augmentation can introduce further noise into the training data.
Existing methods for handling noisy data often require strict assumptions and well-curated datasets.
This limitation hinders the development of robust machine learning models in healthcare.

Purpose of the Study:

To develop a novel, reliability-based training data cleaning method using inductive conformal prediction (ICP).
To address the challenge of noisy labels in large biomedical datasets without strict modeling assumptions.
To improve the performance of machine learning models in various biomedical classification tasks.

Main Methods:

Proposed a novel reliability-based training data cleaning method employing inductive conformal prediction (ICP).
Leveraged ICP-calculated reliability metrics to identify and correct mislabeled data and outliers in noisy datasets.
Utilized a small set of well-curated data alongside vast quantities of noisy data.
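
The idea behind the methods listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the nonconformity measure (one minus the predicted probability of a label), the function names, and the `confident` and `unlikely` thresholds are all assumptions chosen for illustration. A small well-curated calibration set supplies reference nonconformity scores, each noisy sample receives a conformal p-value per candidate label, and those p-values decide whether the sample is kept, relabeled, or dropped as an outlier.

```python
def nonconformity(prob_of_label):
    """Assumed measure: a low predicted probability means high nonconformity."""
    return 1.0 - prob_of_label


def icp_p_values(calib_scores_by_label, sample_probs):
    """Label-conditional conformal p-values for one noisy sample.

    calib_scores_by_label: dict mapping each label to the nonconformity
        scores of well-curated calibration samples that truly have that label.
    sample_probs: dict mapping each label to the model's predicted
        probability for the noisy sample.
    """
    p_values = {}
    for label, prob in sample_probs.items():
        alpha = nonconformity(prob)
        scores = calib_scores_by_label[label]
        # Count calibration samples at least as nonconforming as this one.
        n_at_least = sum(1 for s in scores if s >= alpha)
        p_values[label] = (n_at_least + 1) / (len(scores) + 1)
    return p_values


def clean_label(noisy_label, p_values, confident=0.8, unlikely=0.25):
    """Hypothetical cleaning rule (thresholds are illustrative assumptions).

    Returns None to drop an outlier, a different label to correct a
    suspected mislabel, or the original label to keep the sample.
    """
    best = max(p_values, key=p_values.get)
    if p_values[best] < unlikely:
        return None  # no label is plausible: treat as an outlier
    if (best != noisy_label
            and p_values[best] >= confident
            and p_values[noisy_label] < unlikely):
        return best  # another label is far more plausible: relabel
    return noisy_label  # otherwise keep the given label


# Toy calibration scores for a binary task (made-up numbers).
calib = {0: [0.1, 0.2, 0.3, 0.4], 1: [0.1, 0.2, 0.3, 0.4]}
# A sample labeled 1 that the model strongly believes is class 0.
p = icp_p_values(calib, {0: 0.95, 1: 0.05})
cleaned = clean_label(1, p)
```

In this toy case the sample's given label receives a low p-value while the other label receives a high one, so the rule relabels it. In practice the thresholds would need to be tuned, and the choice of nonconformity measure depends on the underlying classifier.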

Main Results:

Significantly enhanced downstream classification performance across three distinct biomedical tasks: DILI literature filtering, COVID-19 patient ICU admission prediction, and breast cancer subtyping.
Achieved accuracy improvements of up to 11.4% in DILI experiments and up to 74.6% in RNA-sequencing experiments.
Improved the area under the ROC curve (AUROC) and the area under the precision-recall curve (AUPRC) for COVID-19 prediction by up to 23.8% and 69.8%, respectively, and also improved the F1-score on RNA-sequencing data.

Conclusions:

The proposed ICP-based data cleaning method effectively improves classification performance in biomedical machine learning tasks.
This approach reduces the need for extensive well-curated data and avoids strong distributional or modeling assumptions.
The method offers statistically and clinically significant improvements for information retrieval, disease diagnosis, and prognosis.