
A refined approach for evaluating small datasets via binary classification using machine learning.

Steffen Steinert¹,², Verena Ruf¹, David Dzsotjan¹

¹Chair of Physics Education, Ludwig-Maximilians-Universität München (LMU Munich), Munich, Germany.

May 21, 2024

Summary

This summary is machine-generated.

Machine learning analysis of small datasets in education research requires careful evaluation. This study introduces a refined approach using permutation tests and nested cross-validation to ensure reliable, unbiased results for binary classification tasks.


Area of Science:

Machine Learning
Statistical Analysis
Educational Research

Background:

Classical statistical methods are often complemented or replaced by machine learning (ML).
Small datasets, common in fields like education research, pose challenges related to bias and spurious findings.
Evaluating ML performance on limited data requires specialized techniques to ensure reliability.

Purpose of the Study:

To present a refined methodology for evaluating binary classification performance using ML on small datasets.
To address issues of bias and chance in ML model evaluation within data-limited research contexts.
To provide guidelines for selecting appropriate evaluation metrics for small-dataset ML applications.

Main Methods:

Implementation of a non-parametric permutation test to assess the generalizability of ML model results.
Use of repeated nested cross-validation to obtain reliable performance estimates with minimal bias.
Comparative analysis of various evaluation metrics, including the Matthews correlation coefficient.
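
A label-permutation test of the kind listed above might be sketched as follows. This is an illustration only, not the study's implementation: the toy data, the one-feature nearest-class-mean classifier, and the names `loo_accuracy` and `permutation_test` are all assumptions made for the example. The idea is to compare the score obtained on the real labels with the scores obtained after repeatedly shuffling the labels, which breaks any link between features and classes.

```python
import random

def loo_accuracy(x, y):
    """Leave-one-out accuracy of a nearest-class-mean classifier (1-D feature).

    Hypothetical stand-in for whatever scoring routine a real analysis uses.
    """
    hits = 0
    for i in range(len(x)):
        # Class means are computed without the held-out sample i.
        means = {}
        for label in (0, 1):
            vals = [xv for j, (xv, yv) in enumerate(zip(x, y))
                    if j != i and yv == label]
            if vals:
                means[label] = sum(vals) / len(vals)
        pred = min(means, key=lambda c: abs(x[i] - means[c]))
        hits += pred == y[i]
    return hits / len(x)

def permutation_test(x, y, score_fn, n_permutations=500, seed=0):
    """Return the observed score and a permutation p-value.

    The p-value is the share of label shufflings that score at least as well
    as the real labels, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = score_fn(x, y)
    shuffled = list(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # keeps class counts, breaks the x-y link
        if score_fn(x, shuffled) >= observed:
            at_least_as_good += 1
    p_value = (at_least_as_good + 1) / (n_permutations + 1)
    return observed, p_value

# Toy data (assumed): one feature, two well-separated classes of five samples.
x = [0.1, 0.4, 0.3, 0.2, 0.5, 1.9, 2.2, 2.0, 1.8, 2.4]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
observed, p_value = permutation_test(x, y, loo_accuracy)
print(f"observed accuracy = {observed:.2f}, permutation p-value = {p_value:.3f}")
```

A small p-value here indicates that a score as high as the observed one is rarely reached when the labels carry no information. The seed is fixed only to make the sketch reproducible.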

Main Results:

Repeated nested cross-validation demonstrates minimal bias and high reliability, with results largely independent of chance.
The permutation test effectively quantifies the probability of results generalizing to new, unseen data.
The Matthews correlation coefficient is identified as a robust metric for binary classification when both classes are equally important, showing low bias and a low chance of coincidental success.
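
To make the metric concrete, the Matthews correlation coefficient can be computed from the four confusion-matrix counts as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is a plain-Python illustration with made-up labels. The function names are hypothetical, and returning 0 when the denominator vanishes is a common convention assumed here rather than something stated in the source.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(y_true, y_pred):
    """Matthews correlation coefficient, ranging from -1 to +1."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up imbalanced example: a classifier that always predicts the majority class.
y_true = [1] * 9 + [0]
y_pred = [1] * 10
print(accuracy(y_true, y_pred))  # high accuracy despite ignoring the minority class
print(mcc(y_true, y_pred))       # MCC reports no better than chance
```

In this example accuracy rewards the trivial majority-class prediction, while the coefficient does not, which is one reason it suits settings where both classes matter equally.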

Conclusions:

A combination of evaluation metrics is recommended for training and assessing ML classifiers to leverage their respective strengths.
The proposed approach, incorporating permutation tests and nested cross-validation, is crucial for accurate ML analysis of small datasets.
Avoiding biases is paramount when applying machine learning techniques to small datasets, particularly in sensitive research areas like education.
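
For readers unfamiliar with repeated nested cross-validation, the following sketch shows the general structure under simplifying assumptions. The one-feature k-nearest-neighbour classifier, the candidate values of k, the fold counts, and the toy data are all hypothetical choices for illustration and are not taken from the study. The inner loop selects a hyperparameter using only the outer training folds, the outer loop scores that choice on data it never saw, and the whole procedure is repeated with fresh random splits.

```python
import random
from statistics import mean

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D feature)."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == label for x, label in test) / len(test)

def make_folds(data, n_folds, rng):
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]

def split(folds, i):
    """Use fold i as the test set and all remaining folds as the training set."""
    train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
    return train, folds[i]

def nested_cv(data, candidate_ks, n_outer, n_inner, rng):
    outer_folds = make_folds(data, n_outer, rng)
    outer_scores = []
    for i in range(n_outer):
        train, test = split(outer_folds, i)
        # Inner loop: choose k using the outer training data only.
        inner_folds = make_folds(train, n_inner, rng)
        def inner_score(k):
            return mean([accuracy(*split(inner_folds, j), k)
                         for j in range(n_inner)])
        best_k = max(candidate_ks, key=inner_score)
        # Outer loop: score the selected k on the untouched outer test fold.
        outer_scores.append(accuracy(train, test, best_k))
    return mean(outer_scores)

def repeated_nested_cv(data, candidate_ks=(1, 3, 5), n_outer=4, n_inner=3,
                       n_repeats=10, seed=0):
    rng = random.Random(seed)
    scores = [nested_cv(data, candidate_ks, n_outer, n_inner, rng)
              for _ in range(n_repeats)]
    return mean(scores), scores

# Toy data (assumed): 16 samples as (feature, label), with one overlapping point.
data = [(v, 0) for v in (0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 2.1)] + \
       [(v, 1) for v in (1.9, 2.3, 2.5, 2.7, 2.9, 3.1, 3.3, 3.5)]
mean_score, scores = repeated_nested_cv(data)
print(f"mean outer accuracy over {len(scores)} repeats: {mean_score:.3f}")
```

Averaging over repeats reduces the dependence of the estimate on any single random split, which matters most when each fold contains only a handful of samples.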