Efficient cross-validation traversals in feature subset selection.

Ludwig Lausser¹,², Robin Szekely¹, Florian Schmid¹

¹Institute of Medical Systems Biology, Ulm University, Ulm, Germany.

Scientific Reports, December 12, 2022

Summary

This summary is machine-generated.

This study introduces efficient methods to reduce computational complexity in feature selection for classification models. This enhances the coverage and efficiency of identifying key predictive patterns in datasets.

Area of Science:

Machine Learning
Computational Biology
Bioinformatics

Background:

Sparse and robust classification models identify predictive patterns and generate hypotheses.
Feature selection is crucial but computationally challenging due to large search spaces.
Current methods have limited coverage, even for low-dimensional data.
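
To give a feel for why the search space is hard to cover, the short sketch below counts the candidate feature subsets for a few dataset widths. It is only an illustration: the function name `count_subsets` and the chosen feature counts are assumptions made here, not values from the study.

```python
from math import comb


def count_subsets(n_features: int, max_size: int) -> int:
    """Count the non-empty feature subsets containing at most max_size features."""
    return sum(comb(n_features, k) for k in range(1, max_size + 1))


if __name__ == "__main__":
    # Hypothetical feature counts, chosen only to show the growth rate.
    for n in (10, 20, 50):
        small = count_subsets(n, 3)   # subsets with up to three features
        full = count_subsets(n, n)    # every non-empty subset, i.e. 2**n - 1
        print(f"{n} features: {small} subsets of size <= 3, {full} in total")
```

Even when the subset size is capped, the number of candidates grows quickly with the number of features, which is why the cost of preparing each candidate matters.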

Purpose of the Study:

To present methods for reducing the computational complexity of feature selection criteria.
To enhance the efficiency and coverage of feature screening in classification.
To enable higher-dimensional analyses by reducing preparation costs.

Main Methods:

Developed methods to reduce computational complexity of feature selection criteria.
Integrated a parallelizable cross-validation traversal strategy with distance-based classifiers.
Methods are compatible with any product distance or kernel.
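
The idea behind these points can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the squared Euclidean distance as an example of a distance that decomposes into per-feature terms, and it swaps in a 1-nearest-neighbour classifier with leave-one-out cross-validation as the selection criterion. All function names (`feature_distance_matrices`, `add_matrices`, `loo_1nn_accuracy`, `traverse`) and the toy data are hypothetical.

```python
def feature_distance_matrices(X):
    """Build one m x m matrix of squared differences per feature (done once)."""
    m, n = len(X), len(X[0])
    return [[[(X[a][f] - X[b][f]) ** 2 for b in range(m)] for a in range(m)]
            for f in range(n)]


def add_matrices(D, E):
    """Element-wise sum of two equally sized matrices."""
    return [[d + e for d, e in zip(row_d, row_e)] for row_d, row_e in zip(D, E)]


def loo_1nn_accuracy(D, y):
    """Leave-one-out accuracy of 1-nearest-neighbour on a precomputed distance matrix."""
    m = len(y)
    hits = 0
    for i in range(m):
        nearest = min((j for j in range(m) if j != i), key=lambda j: D[i][j])
        hits += y[nearest] == y[i]
    return hits / m


def traverse(per_feature, y, max_size):
    """Depth-first traversal of feature subsets up to max_size features.

    The distance matrix of a subset is its parent's matrix plus one
    per-feature matrix, so it is never rebuilt from the raw data.
    """
    n = len(per_feature)
    results = {}

    def extend(subset, D, start):
        for f in range(start, n):
            D_new = per_feature[f] if D is None else add_matrices(D, per_feature[f])
            new_subset = subset + (f,)
            results[new_subset] = loo_1nn_accuracy(D_new, y)
            if len(new_subset) < max_size:
                extend(new_subset, D_new, f + 1)

    extend((), None, 0)
    return results


if __name__ == "__main__":
    # Toy data (hypothetical): feature 0 separates the classes, feature 1 misleads.
    X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.9], [1.1, 1.1]]
    y = [0, 0, 1, 1]
    scores = traverse(feature_distance_matrices(X), y, max_size=2)
    for subset, accuracy in sorted(scores.items()):
        print(subset, accuracy)
```

In this sketch, extending a subset by one feature costs a single matrix addition, independent of how many features the subset already holds. Branches that start from different first features share no mutable state, so under these assumptions they could be handed to separate workers.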

Main Results:

Achieved significant reduction in computational complexity for high-dimensional subsets.
Demonstrated an approximately 15-fold speedup in generating distance matrices for feature combinations.
Evaluated performance, runtime, and fitness landscape on public datasets.

Conclusions:

The proposed methods significantly improve the efficiency and coverage of feature selection.
They enable more comprehensive evaluations, even in low-dimensional settings.
Advances the potential of sparse classification models for pattern discovery and hypothesis generation.