What should be expected from feature selection in small-sample settings.

Chao Sima¹, Edward R Dougherty

¹Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.

Bioinformatics (Oxford, England)

July 28, 2006

Summary

This summary is machine-generated.

Feature selection in high-dimensional, small-sample biological data is unlikely to find feature sets with near-optimal error. Failure to find a good feature set does not mean that none exists, a point the study illustrates on breast cancer prognosis data.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

High-throughput biological data presents challenges due to high dimensionality and small sample sizes.
Feature selection is crucial for diagnosis and prognosis but can be unreliable under these conditions.
The key question is how reliably feature selection methods find optimal feature sets.

Purpose of the Study:

To investigate whether feature selection methods can reliably identify optimal feature sets.
To determine if the failure to find a good feature set implies the non-existence of such sets.
To provide practical insights for interpreting feature selection results in biological studies.

Main Methods:

Employed three classification rules: linear discriminant analysis, linear support vector machine, and k-nearest-neighbor classification.
Utilized sequential floating forward search and t-test for feature selection.
Applied methods to three feature-label models and breast cancer survival prognosis data.
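
As a rough illustration of the t-test step and of one of the three classification rules listed above, the Python sketch below ranks features by the absolute Welch t-statistic and then classifies with k-nearest neighbors on the selected features. The function names (`t_scores`, `select_top`, `knn_predict`), the choice of the Welch variant, and the toy data are assumptions made for this example; the summary does not state which t-test variant or implementation the authors used.

```python
import math
from statistics import mean, variance

def t_scores(X, y):
    """Absolute Welch t-statistic of each feature between class 0 and class 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        gap = abs(mean(a) - mean(b))
        if se > 0:
            scores.append(gap / se)
        else:
            # Zero spread in both classes: perfectly separated or identical.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores

def select_top(X, y, d):
    """Indices of the d features with the largest |t|, best first."""
    s = t_scores(X, y)
    return sorted(range(len(s)), key=lambda j: -s[j])[:d]

def knn_predict(X_train, y_train, x, feats, k=3):
    """Majority vote among the k nearest training points, using only feats."""
    def dist(i):
        return sum((X_train[i][j] - x[j]) ** 2 for j in feats)
    nearest = sorted(range(len(X_train)), key=dist)[:k]
    votes = sum(y_train[i] for i in nearest)
    return 1 if 2 * votes > k else 0

# Hypothetical toy sample: feature 0 separates the classes well,
# feature 1 only weakly, and feature 2 not at all.
X = [
    [1.0, 0.2, 0.5], [1.2, 0.1, 0.4], [0.9, 0.4, 0.6],
    [3.0, 0.3, 0.5], [3.1, 0.5, 0.3], [2.8, 0.6, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

chosen = select_top(X, y, 2)
label = knn_predict(X, y, [2.9, 0.0, 0.0], feats=chosen[:1])
```

With only three samples per class, the ranking here rests on very noisy variance estimates, which is the small-sample regime the study is concerned with.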

Main Results:

Feature selection methods are unlikely to yield feature sets with errors close to optimal.
The inability to find a good feature set does not preclude the existence of suitable feature sets.
Results were consistent across different classification rules and feature selection techniques.
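
One way a selection method can fail even though a good feature set exists is when features are informative only jointly. The Python sketch below is a hypothetical toy case, not one of the paper's three feature-label models: the label depends on whether features 0 and 1 share a sign, so each has identical class means on its own, while an unrelated noise feature (feature 2) happens to show a gap in this small sample. The data, the helper names `mean_gap` and `loo_1nn_error`, and the use of leave-one-out 1-nearest-neighbor error are all assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy sample: label is 1 exactly when features 0 and 1
# share a sign; feature 2 is noise unrelated to the label.
X = [
    [1.0, 1.0, 0.9], [1.0, 1.0, 0.4], [-1.0, -1.0, 0.8], [-1.0, -1.0, 0.7],
    [1.0, -1.0, 0.1], [1.0, -1.0, 0.6], [-1.0, 1.0, 0.3], [-1.0, 1.0, 0.2],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

def mean_gap(j):
    """Absolute difference between the class means of feature j
    (the numerator of a two-sample t-statistic)."""
    a = [r[j] for r, lab in zip(X, y) if lab == 0]
    b = [r[j] for r, lab in zip(X, y) if lab == 1]
    return abs(mean(a) - mean(b))

def loo_1nn_error(feats):
    """Leave-one-out error of a 1-nearest-neighbor rule restricted to feats."""
    wrong = 0
    for i, x in enumerate(X):
        others = [k for k in range(len(X)) if k != i]
        nearest = min(
            others,
            key=lambda k: sum((X[k][j] - x[j]) ** 2 for j in feats),
        )
        wrong += y[nearest] != y[i]
    return wrong / len(X)

gaps = [mean_gap(j) for j in range(3)]
# A univariate criterion prefers the noise feature.
univariate_pick = max(range(3), key=lambda j: gaps[j])
# An exhaustive search over pairs, feasible only for tiny feature counts.
best_pair = min(combinations(range(3), 2), key=loo_1nn_error)
```

On this sample the univariate criterion picks feature 2, while the exhaustive pair search returns features 0 and 1 with zero leave-one-out error. A failed univariate selection therefore says nothing about whether a good pair exists.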

Conclusions:

Experimenters should not conclude, solely from the failure of current selection methods, that good feature sets do not exist.
The findings have practical implications for interpreting the success or failure of feature selection in high-dimensional data analysis.
Understanding these limitations is vital for accurate diagnosis and prognosis using biological data.