An improved SMOTE algorithm for enhanced imbalanced data classification by expanding sample generation space.

Ying Li (1,2), Yali Yang (1), Peihua Song (3,4)

  • 1School of Logistics Management and Engineering, Nanning Normal University, Nanning, 530001, Guangxi, China.

Scientific Reports | July 2, 2025 | PubMed
Summary
This summary is machine-generated.

Class imbalance in datasets hinders model performance. The proposed enhanced Synthetic Minority Over-sampling Technique (SMOTE) generates more realistic synthetic samples, improving classifier accuracy and robustness on imbalanced data.

Keywords:
Classification, Imbalanced datasets, Oversampling, SMOTE

Area of Science:

  • Machine Learning
  • Data Science
  • Artificial Intelligence

Background:

  • Class imbalance in datasets significantly degrades classification model performance.
  • Existing over-sampling techniques like SMOTE often fail to preserve local data density and distribution.
  • Improved methods are needed to synthesize samples that better reflect original data characteristics for robust classification.
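
For context, the baseline that the study builds on can be sketched in a few lines. This is a minimal illustration of classic SMOTE-style interpolation, not code from the paper: the function names `k_nearest` and `smote_sample` and the default `k=3` are assumptions made for this example.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda p: math.dist(point, p))[:k]


def smote_sample(minority, k=3, rng=None):
    """Generate one synthetic point by classic SMOTE-style interpolation."""
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Candidate neighbours exclude the base point itself.
    candidates = [p for p in minority if p is not base]
    neighbour = rng.choice(k_nearest(base, candidates, k))
    gap = rng.random()  # uniform in [0, 1)
    # The new point lies on the line segment between base and neighbour.
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
```

Because every synthetic point falls on a line segment between two existing minority samples, the generated data can only occupy those segments, which is one way the local density and distribution distortions described above can arise.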

Purpose of the Study:

  • To introduce an enhanced SMOTE algorithm (ISMOTE) that incorporates local spatial information for synthetic sample generation.
  • To address the limitations of traditional SMOTE in handling local data distribution and density distortions.
  • To improve the robustness and performance of classification models on imbalanced datasets.

Main Methods:

  • Proposing ISMOTE, which modifies spatial constraints for generating synthetic samples.
  • Generating a base sample between two original samples and using Euclidean distance to create new samples.
  • Adaptively expanding the synthetic sample generation space to better preserve local data distribution.
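
The steps above can be illustrated with a rough sketch. The summary does not give the exact ISMOTE rules, so everything here is an assumption for illustration: the function name `expanded_sample`, the `radius_scale` parameter, and the choice to draw the new point uniformly from a ball whose radius is proportional to the Euclidean distance between the two original samples. It should not be read as the authors' algorithm.

```python
import math
import random


def expanded_sample(a, b, radius_scale=0.5, rng=None):
    """Hypothetical sketch: sample near, not only on, the segment a-b."""
    rng = rng or random.Random()
    dim = len(a)
    # Step 1: base sample on the segment between the two original samples.
    t = rng.random()
    base = [x + t * (y - x) for x, y in zip(a, b)]
    # Step 2 (assumed rule): radius proportional to the Euclidean distance.
    radius = radius_scale * math.dist(a, b)
    # Step 3: draw a point uniformly inside the ball around the base sample.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    length = radius * rng.random() ** (1.0 / dim)
    return tuple(c + length * d / norm for c, d in zip(base, direction))
```

Compared with plain interpolation, the generated point may land anywhere within `radius` of the segment, so the region available for synthetic samples is wider. How the published method sets and adapts that region is not stated in this summary.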

Main Results:

  • Comparative analysis with seven over-sampling algorithms on thirteen public datasets.
  • ISMOTE demonstrated more realistic data distributions in 2D and 3D scatter plots.
  • Significant improvements in classifier performance: F1-score (+13.07%), G-mean (+16.55%), and AUC (+7.94%).
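
For readers unfamiliar with two of the reported metrics, the following sketch computes F1-score and G-mean from binary confusion-matrix counts using their standard definitions. The function name `f1_and_gmean` and the convention of returning 0.0 when a denominator is zero are assumptions made for this example; the summary does not say how the paper computed or averaged its scores.

```python
import math


def f1_and_gmean(tp, fp, fn, tn):
    """Return (F1-score, G-mean) for the positive (minority) class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # G-mean is the geometric mean of sensitivity and specificity.
    gmean = math.sqrt(recall * specificity)
    return f1, gmean
```

Both metrics penalize a classifier that ignores the minority class: with 10 minority and 90 majority samples, predicting "majority" for everything gives 90% accuracy, yet `f1_and_gmean(0, 0, 10, 90)` returns zero for both scores.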

Conclusions:

  • ISMOTE effectively alleviates distortions in local data distribution and density.
  • The algorithm shows parameter adaptability for multi-class imbalanced datasets.
  • ISMOTE offers a superior approach to handling class imbalance for improved machine learning model performance.