Updated: Nov 27, 2025

Simple Stopping Criteria for Information Theoretic Feature Selection.

Shujian Yu¹, José C Príncipe¹

  • ¹Computational NeuroEngineering Laboratory, University of Florida, Gainesville, FL 32611, USA.

Entropy (Basel, Switzerland) | December 3, 2020
Summary
This summary is machine-generated.

This study introduces two novel stopping criteria for information-theoretic feature selection. Both criteria monitor conditional mutual information (CMI) to decide when a feature subset is large enough, with the aim of reducing generalization error. Because the CMI computation is simplified, the criteria are easy to add to greedy search strategies.

Area of Science:

  • Machine Learning
  • Information Theory
  • Data Science

Background:

  • Feature selection is crucial for minimizing generalization error by identifying optimal feature subsets.
  • Information theory-based methods maximize mutual information between features and class labels but face optimization challenges.
  • Determining optimal subset size and effective stopping criteria for greedy feature selection remains an open problem.
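
To make the open problem concrete, here is a minimal sketch of greedy forward selection that maximizes mutual information with the class labels. It is not the paper's method: it assumes discrete features, uses a simple plug-in (histogram) estimator, and the function names `mutual_information`, `joint_code`, and `greedy_select` are hypothetical. Note that the caller has to supply the subset size `k`, which is exactly the quantity a stopping criterion would determine automatically.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(x; y) in bits for two discrete 1-D arrays."""
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Joint histogram of the two variables, normalized to a probability table.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx.ravel(), y_idx.ravel()), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def joint_code(columns):
    """Encode several discrete columns, shape (n_samples, k), as one discrete variable."""
    _, codes = np.unique(columns, axis=0, return_inverse=True)
    return codes.ravel()

def greedy_select(X, y, k):
    """Add, k times, the feature that most increases I(selected features; y)."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        scores = [
            mutual_information(joint_code(X[:, selected + [j]]), y)
            for j in remaining
        ]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a fixed `k`, the loop keeps adding features even after the selected subset already carries all the available information about `y`; choosing `k` by hand is the step the proposed criteria aim to replace.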

Purpose of the Study:

  • To propose two novel stopping criteria for information-theoretic feature selection methods.
  • To determine the optimal subset size automatically in greedy feature selection.
  • To make information-theoretic feature selection easier to implement in practice.

Main Methods:

  • Developed two stopping criteria based on monitoring conditional mutual information (CMI) among variable groups.
  • Employed a recently developed multivariate matrix-based Rényi's α-entropy functional, which is estimated directly from data samples.
  • Demonstrated that CMI can be computed without decomposition or approximation, which makes the criteria easy to integrate.

Main Results:

  • The proposed stopping criteria effectively monitor CMI among groups of variables.
  • The multivariate matrix-based Rényi's α-entropy functional allows for direct and accurate CMI estimation from data.
  • The method simplifies CMI computation, making the criteria easy to implement and integrate into existing algorithms.

Conclusions:

  • The suggested stopping criteria offer a practical way to optimize feature selection in information-theoretic approaches.
  • These criteria can be seamlessly integrated into existing greedy search-based feature selection methods.
  • The findings contribute to more efficient and effective feature selection, improving model generalization.

Keywords:

  • conditional mutual information
  • feature selection
  • multivariate matrix-based Rényi's α-entropy functional
  • stopping criterion
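
The estimation and stopping steps described above can be sketched roughly as follows. This summary does not give the exact form of the two criteria, so the code shows only one plausible, threshold-based reading: stop once the features not yet selected share little information with the labels given the features already selected. The Gaussian kernel, its width `sigma`, the order `alpha=1.01`, the `threshold` value, and all function names are assumptions made for illustration, not values taken from the paper.

```python
import numpy as np

def gram(x, sigma=1.0):
    """Gaussian Gram matrix, normalized to trace 1, for samples of shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    k = np.exp(-sq / (2.0 * sigma ** 2))
    return k / np.trace(k)

def renyi_entropy(*grams, alpha=1.01):
    """Matrix-based Renyi alpha-entropy in bits. With several Gram matrices,
    returns the joint entropy via their normalized Hadamard product."""
    a = grams[0]
    for g in grams[1:]:
        a = a * g
    a = a / np.trace(a)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)
    return float(np.log2(np.sum(eig ** alpha)) / (1.0 - alpha))

def cmi(a, b, c=None, alpha=1.01):
    """I(A; B | C) = S(A,C) + S(B,C) - S(A,B,C) - S(C); plain I(A; B) if c is None."""
    s = lambda *g: renyi_entropy(*g, alpha=alpha)
    if c is None:
        return s(a) + s(b) - s(a, b)
    return s(a, c) + s(b, c) - s(a, b, c) - s(c)

def select_with_stopping(X, y, threshold=0.05, sigma=1.0, alpha=1.01):
    """Greedy forward selection that stops when the remaining features carry
    less than `threshold` bits about y given the selected ones (assumed rule)."""
    gy = gram(y, sigma)
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        g_sel = gram(X[:, selected], sigma) if selected else None
        # Stopping check: information left in the unselected group.
        leftover = cmi(gram(X[:, remaining], sigma), gy, g_sel, alpha)
        if leftover < threshold:
            break
        # Otherwise add the single feature with the largest conditional gain.
        gains = [cmi(gram(X[:, j], sigma), gy, g_sel, alpha) for j in remaining]
        best = remaining[int(np.argmax(gains))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because every entropy term is computed from the eigenvalues of a Gram matrix, the group of remaining features is handled as a single multivariate variable and no pairwise decomposition of the CMI is needed. In practice `sigma` and `threshold` would have to be tuned to the scale of the data.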