Evaluating feature selection strategies for high dimensional, small sample size datasets.

Abhishek Golugula¹, George Lee, Anant Madabhushi

¹Department of Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey 08854, USA.

Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference

January 19, 2012

Summary

This summary is machine-generated.

Feature selection schemes for high-dimensional biomedical data with small sample sizes should prioritize robustness alongside accuracy. The Wilcoxon Rank Sum Test demonstrated superior performance in both classification accuracy and robustness for these challenging datasets.

Area of Science:

Biomedical Informatics
Bioinformatics
Computational Biology

Background:

High-dimensional (HD) biomedical datasets, common in genomics, often present a small sample size (SSS).
Traditional evaluation of feature selection (FS) schemes using classifier accuracy may be insufficient for HD/SSS data due to the curse of dimensionality and potential lack of representativeness.
Robustness, defined as invariance to training data changes, is a critical but often overlooked metric for FS schemes in HD/SSS contexts.
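
One way to make "invariance to training data changes" concrete is to compare the feature subsets that a scheme selects on different resamplings of the training data. The sketch below is only an illustration under that assumption: it scores stability as the mean pairwise Jaccard similarity between subsets, which is not necessarily the Robustness measure defined in the paper. The function names `jaccard` and `selection_stability` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of feature indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def selection_stability(selected_sets):
    """Mean pairwise Jaccard similarity across feature subsets.

    Each element of `selected_sets` is the subset chosen by the same
    FS scheme on a different resampling of the training data.
    NOTE: an assumed stand-in for robustness, not the paper's formula.
    """
    pairs = list(combinations(selected_sets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy inputs: a scheme that always picks the same features,
# and one whose picks never overlap between resamplings.
stable = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
unstable = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

print(selection_stability(stable))    # identical subsets
print(selection_stability(unstable))  # disjoint subsets
```

A score near 1 means the scheme keeps choosing the same features as the training data changes; a score near 0 means the selection depends heavily on which samples happened to be drawn.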

Purpose of the Study:

To analyze and evaluate different strategies for comparing FS schemes on HD/SSS biomedical datasets.
To introduce and utilize a new performance measure, Robustness, for FS schemes.
To compare the efficacy of five distinct FS schemes on diverse HD/SSS gene and protein expression datasets.

Main Methods:

Quantitative comparison of five FS schemes: T-test, F-test, Kolmogorov-Smirnov Test, Wilks Lambda Test, and Wilcoxon Rank Sum Test.
Evaluation using classifier accuracy (K-Nearest Neighbor and Random Forest) and the newly defined Robustness measure.
Application to five HD/SSS biomedical datasets from cancer, bone lesions, celiac disease, and coronary heart disease studies.
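
As a rough illustration of filter-style feature selection with the Wilcoxon Rank Sum Test, the sketch below ranks each feature by the absolute value of its rank-sum z statistic between two classes and keeps the top k. It makes several assumptions that do not come from the paper: binary labels coded 0 and 1, the normal approximation without tie correction, and toy data invented purely for the example. The helper names (`average_ranks`, `rank_sum_z`, `rank_features`) are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(x, y):
    """Wilcoxon rank-sum z statistic (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])  # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mean) / sd


def rank_features(samples, labels, k):
    """Indices of the k features with the largest |z| between classes 0 and 1."""
    n_features = len(samples[0])
    scores = []
    for f in range(n_features):
        x = [row[f] for row, lab in zip(samples, labels) if lab == 0]
        y = [row[f] for row, lab in zip(samples, labels) if lab == 1]
        scores.append(abs(rank_sum_z(x, y)))
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)[:k]


# Invented toy data: 6 samples x 3 features; only feature 1 separates the classes.
samples = [
    [5.0, 1.0, 3.0],
    [2.0, 1.2, 7.0],
    [4.0, 0.9, 5.0],
    [3.0, 3.1, 4.0],
    [6.0, 2.9, 6.0],
    [1.0, 3.3, 2.0],
]
labels = [0, 0, 0, 1, 1, 1]

print(rank_features(samples, labels, 2))
```

In practice a library routine such as `scipy.stats.ranksums` would normally replace the hand-written statistic. The selected indices could then be passed to a classifier such as K-Nearest Neighbor or Random Forest to estimate accuracy on held-out samples.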

Main Results:

The Wilcoxon Rank Sum Test significantly outperformed the other four FS schemes in both classification accuracy and robustness.
The results showed that classifier accuracy alone is not always sufficient for evaluating FS schemes on HD/SSS datasets.
The Robustness metric proved important for assessing the reliability of FS schemes on HD/SSS data.

Conclusions:

Both classifier accuracy and Robustness are essential considerations when selecting an appropriate FS scheme for HD/SSS biomedical datasets.
The Wilcoxon Rank Sum Test emerges as a highly effective FS scheme for HD/SSS biomedical data.
Future evaluations of FS schemes for HD/SSS data should incorporate Robustness as a key performance indicator.