Updated: Nov 27, 2025

Simple Stopping Criteria for Information Theoretic Feature Selection.

Shujian Yu¹, José C Príncipe¹

¹Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL 32611, USA.

Entropy (Basel, Switzerland)

December 3, 2020

Summary

This summary is machine-generated.


This study introduces two novel stopping criteria for information-theoretic feature selection. Both monitor conditional mutual information (CMI) to decide when a greedy search should stop adding features, with the aim of reducing generalization error. Because CMI is estimated directly from data, the criteria are simple to compute and to add to existing greedy search strategies.

Area of Science:

Machine Learning
Information Theory
Data Science

Background:

Feature selection is crucial for minimizing generalization error by identifying optimal feature subsets.
Information theory-based methods maximize mutual information between features and class labels but face optimization challenges.
Determining optimal subset size and effective stopping criteria for greedy feature selection remains an open problem.
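
The link between the mutual-information objective and a stopping rule can be made explicit with the chain rule for mutual information, a standard identity. The symbols below (S for the already selected subset, x for a candidate feature, y for the class label) are notation assumed here for illustration and do not come from the summary itself.

```latex
I(S \cup \{x\};\, y) \;=\; I(S;\, y) \;+\; I(x;\, y \mid S)
```

The second term is the CMI: the extra information about y that x contributes once S is known. When it is close to zero for every remaining candidate, adding features no longer increases the objective, which is what makes CMI a natural quantity to monitor.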

Purpose of the Study:

To propose two novel stopping criteria for information theoretic feature selection methods.
To address the challenge of automatic optimal subset size determination in greedy feature selection.
To enhance the practical implementation of information theoretic feature selection.

Keywords:

conditional mutual information
feature selection
multivariate matrix-based Rényi's α-entropy functional
stopping criterion

Main Methods:

Developed two stopping criteria based on monitoring conditional mutual information (CMI) among variable groups.
Employed a recently developed multivariate matrix-based Rényi's α-entropy functional for direct data sample estimation.
Demonstrated that CMI can be computed without decomposition or approximation, facilitating easy integration.
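
For orientation, the matrix-based functional is usually written as follows. This is a sketch based on the general literature on this estimator, not a transcription of the paper's equations; the kernel κ, the sample size n, and the variable groups X, Y, Z are notation assumed here.

```latex
\begin{aligned}
A_{ij} &= \frac{1}{n}\,\frac{K_{ij}}{\sqrt{K_{ii}K_{jj}}}, \qquad K_{ij} = \kappa(x_i, x_j),\\
S_\alpha(A) &= \frac{1}{1-\alpha}\,\log_2 \operatorname{tr}\!\left(A^{\alpha}\right),\\
S_\alpha(A_1,\dots,A_k) &= S_\alpha\!\left(\frac{A_1 \circ \cdots \circ A_k}{\operatorname{tr}(A_1 \circ \cdots \circ A_k)}\right),\\
I_\alpha(X;\,Y \mid Z) &= S_\alpha(X, Z) + S_\alpha(Y, Z) - S_\alpha(Z) - S_\alpha(X, Y, Z).
\end{aligned}
```

Here ∘ is the element-wise (Hadamard) product. Every term on the last line is an entropy of a normalized Gram matrix built from the samples, so no probability density has to be estimated and the CMI does not need to be split into lower-order terms.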

Main Results:

The proposed stopping criteria effectively monitor CMI among groups of variables.
The multivariate matrix-based Rényi's α-entropy functional allows for direct and accurate CMI estimation from data.
The method simplifies CMI computation, making the criteria easy to implement and integrate into existing algorithms.
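
To make the "direct estimation from data" point concrete, here is a minimal NumPy sketch of a matrix-based Rényi α-entropy and the CMI built from it. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian kernel, the kernel width `sigma`, and the default `alpha=1.01` are all choices made here.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Gaussian-kernel Gram matrix of one variable; x has shape (n,) or (n, d).
    The kernel and its width sigma are assumptions of this sketch."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    sq = np.sum(x ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def joint_entropy(grams, alpha=1.01):
    """Matrix-based Renyi alpha-entropy, in bits, of the variables behind `grams`
    (a list of n-by-n Gram matrices, one per variable)."""
    if not grams:
        return 0.0  # entropy of an empty group of variables
    prod = np.ones_like(grams[0], dtype=float)
    for k in grams:
        prod = prod * k          # Hadamard product couples the variables
    a = prod / np.trace(prod)    # normalize to unit trace
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # guard tiny negative values
    return float(np.log2(np.sum(eig ** alpha)) / (1.0 - alpha))

def conditional_mi(x, y, z, alpha=1.01):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z).
    x, y and z are Python lists of Gram matrices (z may be empty)."""
    return (joint_entropy(x + z, alpha) + joint_entropy(y + z, alpha)
            - joint_entropy(z, alpha) - joint_entropy(x + y + z, alpha))
```

A typical call would build one Gram matrix per feature and one for the labels, then evaluate `conditional_mi([K_candidate], [K_labels], selected_grams)`, where `selected_grams` is the list of Gram matrices of the features chosen so far. The estimate depends on `sigma` and `alpha`, so both should be treated as tuning choices.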

Conclusions:

The suggested stopping criteria offer a practical solution to optimize feature selection in information theoretic approaches.
These criteria can be seamlessly integrated into existing greedy search-based feature selection methods.
The findings contribute to more efficient and effective feature selection, improving model generalization.
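
How a CMI-based stop might slot into a greedy forward search can be sketched as follows. This is a hypothetical illustration: the summary does not state the exact quantity or decision rule the two criteria use, so the fixed `threshold`, the function `greedy_select`, and the toy scorer with made-up numbers are all assumptions.

```python
from typing import Callable, Hashable, List, Sequence

def greedy_select(
    candidates: Sequence[Hashable],
    cmi_given_selected: Callable[[Hashable, List[Hashable]], float],
    threshold: float,
) -> List[Hashable]:
    """Forward selection that stops once the best remaining CMI gain is <= threshold."""
    selected: List[Hashable] = []
    remaining = list(candidates)
    while remaining:
        # Score each remaining feature by its estimated I(feature; y | selected).
        gains = [cmi_given_selected(f, selected) for f in remaining]
        best = max(range(len(remaining)), key=gains.__getitem__)
        if gains[best] <= threshold:
            break  # no remaining feature adds enough information
        selected.append(remaining.pop(best))
    return selected

# Toy scorer (made-up numbers, for illustration only): each feature has a base
# relevance that is halved for every feature already selected.
base = {"f1": 0.9, "f2": 0.5, "f3": 0.2, "f4": 0.05}

def toy_cmi(feature, selected):
    return base[feature] / (2 ** len(selected))

chosen = greedy_select(list(base), toy_cmi, threshold=0.1)
print(chosen)  # ['f1', 'f2']
```

Passing the scorer in as a callable keeps the search loop independent of how CMI is estimated, which mirrors the claim that the criteria can be added to existing greedy methods without changing them. In practice the toy scorer would be replaced by an estimator computed from data.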