Updated: Feb 20, 2026

Rank-based learning: a novel high-throughput algorithm resilient to missing data and effective for datasets with

Lulu Song¹, Hamid Khoshfekr Rudsari¹, Johannes F Fahrmann²

¹Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Briefings in Bioinformatics

February 18, 2026

Summary

This summary is machine-generated.

A new Rank-Based Learning (RBL) method classifies omics data using relative feature rankings rather than absolute expression levels. It outperformed logistic regression and random forest on simulated data and on two plasma proteomics cancer datasets, offering a robust basis for more reliable diagnostic tools.

Keywords:

high-throughput omics; machine learning; missing data; rank-based learning

Area of Science:

Bioinformatics
Computational Biology
Machine Learning

Background:

High-throughput omics data present classification challenges due to platform variability, batch effects, missing values, and high dimensionality.
Existing methods struggle with noise and inconsistencies inherent in omics data, limiting diagnostic model reliability.

Purpose of the Study:

To introduce and evaluate a novel Rank-Based Learning (RBL) method for binary classification of high-throughput omics data.
To enhance the robustness and generalizability of diagnostic models by leveraging relative feature rankings.

Main Methods:

Developed a Rank-Based Learning (RBL) algorithm focusing on relative feature rankings.
Evaluated RBL against Logistic Regression (LR) and Random Forest (RF) using simulated data.
Validated RBL on two real-world plasma proteomics datasets: small cell lung cancer (SCLC) and duodenopancreatic neuroendocrine tumors (dpNET) in MEN1 patients.
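
The summary does not spell out how RBL computes its rankings, so the following is only a minimal sketch of the general idea of replacing absolute values with within-sample ranks while leaving missing entries untouched. The function name `rank_within_sample`, the use of average ranks for ties, and the choice to keep missing values as `None` are all assumptions made for illustration, not the authors' algorithm.

```python
import math


def rank_within_sample(values):
    """Rank the observed features of one sample (1 = smallest).

    Hypothetical helper, not the published RBL code. Missing entries
    (None or NaN) are skipped and stay None in the output; tied values
    share the average of the ranks they span.
    """
    observed = [
        (v, i)
        for i, v in enumerate(values)
        if v is not None and not math.isnan(v)
    ]
    observed.sort(key=lambda pair: pair[0])

    ranks = [None] * len(values)
    pos = 0
    while pos < len(observed):
        # Extend the window over all entries tied with observed[pos].
        end = pos
        while end + 1 < len(observed) and observed[end + 1][0] == observed[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for _, idx in observed[pos:end + 1]:
            ranks[idx] = avg_rank
        pos = end + 1
    return ranks


# One sample with five features; the second feature was not measured.
sample = [5.2, None, 1.1, 3.3, 3.3]
print(rank_within_sample(sample))
```

Because ranks are computed only over the features that were actually measured, a missing value in this sketch does not need to be imputed before the sample can be represented.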

Main Results:

RBL outperformed LR and RF in simulation experiments, particularly under batch effects and missing data conditions.
In SCLC classification, RBL achieved a test AUC of 0.76, superior to LR (0.65) and RF (0.59).
For dpNET, RBL achieved a test AUC of 0.80, compared with 0.57 for LR and 0.53 for RF.
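
For readers unfamiliar with the AUC values quoted above, the sketch below computes AUC from its pairwise definition: the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as half. The function name `auc_from_scores` and the toy scores are illustrative assumptions; this is not the evaluation code used in the study.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels use 1 for cases and 0 for controls. Illustrative only;
    quadratic in the number of samples, which is fine for small sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above control
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: two cases and two controls.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc_from_scores(scores, labels))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking of cases against controls, which gives context for the reported gaps between RBL and the LR and RF baselines.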

Conclusions:

Rank-Based Learning (RBL) effectively mitigates non-biological variation by emphasizing feature rankings over absolute expression levels.
RBL improves predictive performance, measured by test AUC, over LR and RF for diagnostic models built on complex omics data.
The RBL framework offers a promising avenue for developing more reliable and clinically applicable omics-based diagnostic tools.
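
The emphasis on feature rankings over absolute expression levels can be illustrated with a small sketch. It assumes a deliberately simple, hypothetical batch effect (the same increasing linear shift and scale applied to every feature in a sample); real batch effects can be more complex, and the function `ranks` is an illustrative helper rather than part of RBL.

```python
def ranks(values):
    """Return the rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


# The same hypothetical sample measured in two batches. Batch B applies
# an assumed scale-and-shift distortion to every feature.
batch_a = [2.0, 7.5, 4.1, 0.3]
batch_b = [3.0 * x + 10.0 for x in batch_a]

print(ranks(batch_a))  # [2, 4, 3, 1]
print(ranks(batch_b))  # [2, 4, 3, 1]
```

The absolute values differ between the two batches, but the within-sample ordering of features is unchanged, so a model that sees only ranks receives identical input in both cases under this assumption.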