


Wrapper feature selection for small sample size data driven by complete error estimates.

Martin Macaš¹, Lenka Lhotská, Eduard Bakstein

  • ¹Czech Technical University, Faculty of Electrical Engineering, Department of Cybernetics, Karlovo Namesti 13, 12135 Prague, Czech Republic.

Computer Methods and Programs in Biomedicine, April 5, 2012
Summary
This summary is machine-generated.

This study introduces a complete bootstrap method for wrapper feature selection with 1-nearest neighbor (1NN) classifiers, aimed at small biomedical datasets. Feature subsets selected with the complete criteria performed significantly better than those selected with standard cross-validation and bootstrap criteria.


Area of Science:

  • Machine Learning
  • Bioinformatics
  • Computational Biology

Background:

  • Wrapper-based feature selection is crucial for high-dimensional data, particularly in biomedical applications with limited sample sizes.
  • 1-nearest neighbor (1NN) classifiers are sensitive to feature relevance, necessitating effective selection methods.
  • Standard cross-validation and bootstrap error estimates can have high variance when the sample is small.
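
To make the variance problem concrete, the following minimal sketch estimates the 1NN error with a standard Monte Carlo bootstrap (out-of-bag evaluation, 50 resamples) and repeats the estimate under different random seeds. The toy data, the function names, and the out-of-bag variant are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def one_nn_predict(X_train, y_train, X_test):
    """Label each test row with the label of its nearest training row (Euclidean)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d.argmin(axis=1)]

def bootstrap_error(X, y, n_trials=50, seed=0):
    """Monte Carlo out-of-bag bootstrap estimate of the 1NN error rate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for _ in range(n_trials):
        idx = rng.integers(0, n, size=n)        # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)   # points left out of this resample
        if oob.size == 0:                       # nothing to test on; skip the trial
            continue
        pred = one_nn_predict(X[idx], y[idx], X[oob])
        errors.append(np.mean(pred != y[oob]))
    return float(np.mean(errors))

def make_toy_data(n=20, n_features=5, seed=0):
    """Hypothetical small dataset: only feature 0 carries class information."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, n_features))
    X[:, 0] += 2.0 * y
    return X, y

X, y = make_toy_data()
# Same data, different random resamples: the estimate itself moves around.
estimates = [bootstrap_error(X, y, n_trials=50, seed=s) for s in range(20)]
spread = max(estimates) - min(estimates)
print(f"min={min(estimates):.3f} max={max(estimates):.3f} spread={spread:.3f}")
```

When such an estimate is used as a selection criterion, this seed-to-seed spread can change which feature subset looks best, which is the motivation for criteria that do not depend on a random choice of resamples.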

Purpose of the Study:

  • To propose and evaluate a complete bootstrap technique for feature selection in 1NN classifiers.
  • To assess the efficacy of complete bootstrap and complete cross-validation error estimates as selection criteria.
  • To compare these novel criteria against standard methods using various optimization algorithms.

Main Methods:

  • Developed a complete bootstrap method for 1NN classifiers, averaging over all data partitions.
  • Utilized complete bootstrap and complete cross-validation error estimates as novel feature selection criteria.
  • Compared performance against standard 2-fold cross-validation, 10-fold cross-validation, and bootstrap (50 trials), each combined with Sequential Forward Selection (SFS), Binary Particle Swarm Optimization (BPSO), and Simplified Social Impact Theory based Optimization (SSITO).
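
As an illustration of "averaging over all data partitions", the sketch below computes a complete cross-validation error for 1NN in closed form and checks it against brute-force enumeration. It relies on a standard combinatorial argument: if the training set is a random m-subset of the other n-1 points, the j-th nearest neighbour of a held-out point is its nearest training point with probability C(n-1-j, m-1) / C(n-1, m). The summary does not give the authors' formulas, so this is an assumed formulation with hypothetical function names, and it covers only cross-validation, not the complete bootstrap.

```python
from itertools import combinations
from math import comb
import numpy as np

def complete_cv_error_1nn(X, y, m):
    """Average 1NN test error over ALL splits that use m training points."""
    n = len(y)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)                  # a point is never its own neighbour
    order = np.argsort(d, axis=1)[:, : n - m]    # only ranks 1..n-m can be nearest
    # Probability that the rank-j neighbour is the nearest *training* point.
    weights = np.array([comb(n - 1 - j, m - 1) / comb(n - 1, m)
                        for j in range(1, n - m + 1)])
    wrong = y[order] != y[:, None]               # shape (n, n-m)
    return float((wrong * weights).sum(axis=1).mean())

def brute_force_cv_error_1nn(X, y, m):
    """Same quantity by enumerating every split (feasible only for tiny n)."""
    n = len(y)
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    errs = []
    for train in combinations(range(n), m):
        train = np.array(train)
        test = np.setdiff1d(np.arange(n), train)
        nearest = train[d[np.ix_(test, train)].argmin(axis=1)]
        errs.append(np.mean(y[nearest] != y[test]))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
X_small = rng.normal(size=(8, 3))                # hypothetical toy data
y_small = np.array([0, 1] * 4)
fast = complete_cv_error_1nn(X_small, y_small, m=4)
slow = brute_force_cv_error_1nn(X_small, y_small, m=4)
print(fast, slow)
```

The closed form needs one distance matrix and one sort per point, whereas enumeration grows with the number of splits (70 here for n = 8, m = 4). Because no random partition is drawn, repeated evaluations of the same feature subset return the same value.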

Main Results:

  • Complete criteria significantly outperformed standard cross-validation and bootstrap methods across all tested search strategies (SFS, BPSO, SSITO).
  • 1NN wrappers employing complete criteria with SFS demonstrated superior performance compared to FILTER and SIMBA.
  • The proposed methods showed benefits in a real-world application for automatic subthalamic nucleus detection.
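
The wrapper idea behind these comparisons can be sketched as a Sequential Forward Selection loop that accepts any error estimate as its criterion. In this minimal sketch the criterion is the leave-one-out 1NN error, used only as a deterministic stand-in; the paper's complete bootstrap and complete cross-validation criteria are not reproduced here, and the data and function names are hypothetical.

```python
import numpy as np

def loo_error_1nn(X, y):
    """Leave-one-out 1NN error (stand-in criterion, not the paper's estimate)."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)                  # exclude each point from its own search
    return float(np.mean(y[d.argmin(axis=1)] != y))

def sequential_forward_selection(X, y, criterion, max_features=None):
    """Greedily add the feature that most reduces criterion(X_subset, y)."""
    n_features = X.shape[1]
    max_features = max_features or n_features
    selected, best_err = [], np.inf
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = [criterion(X[:, selected + [f]], y) for f in candidates]
        best = int(np.argmin(scores))
        if scores[best] >= best_err:             # no candidate improves the criterion
            break
        selected.append(candidates[best])
        best_err = scores[best]
    return selected, best_err

rng = np.random.default_rng(1)
y_toy = np.repeat([0, 1], 15)
X_toy = rng.normal(size=(30, 6))
X_toy[:, 2] += 8.0 * y_toy                       # only feature 2 separates the classes
selected, err = sequential_forward_selection(X_toy, y_toy, loo_error_1nn)
print(selected, err)
```

Swapping in a different criterion means passing another function with the same `(X_subset, y)` signature, which is how a wrapper can be compared across error estimates while the search strategy stays fixed.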

Conclusions:

  • The complete bootstrap and complete cross-validation error estimates offer lower variance and superior performance for feature selection in 1NN classifiers, especially with small sample sizes.
  • Complete criterion-based 1NN wrappers, particularly with SFS, are highly effective and recommended for biomedical data analysis.
  • The developed techniques are validated through successful application in detecting the subthalamic nucleus.