



Clustering and classification for dry bean feature imbalanced data.

Chou-Yuan Lee1, Wei Wang2, Jian-Qiong Huang3

  • 1School of Big Data, Fuzhou University of International Studies and Trade, Fuzhou, 350202, China. lqy@fzfu.edu.cn.

Scientific Reports | December 27, 2024
PubMed
Summary
This summary is machine-generated.

This study introduces a novel algorithm combining Borderline-Synthetic Minority Oversampling Technique (BLSMOTE) and K-means clustering to enhance machine learning classification accuracy for imbalanced datasets. The proposed method significantly improves performance metrics like precision and recall.

Keywords: BLSMOTE, Decision tree, Imbalanced data, K-means, Random forest, Support vector machine


Area of Science:

  • Machine Learning
  • Data Science
  • Computer Science

Background:

  • Traditional machine learning models like Decision Trees (DT), Random Forests (RF), and Support Vector Machines (SVM) exhibit limited classification performance on imbalanced datasets.
  • Imbalanced data, where one class significantly outnumbers others, poses a challenge for model training and accurate prediction.
  • Existing methods often struggle to effectively handle class imbalance, leading to biased models and poor generalization.

Purpose of the Study:

  • To develop and evaluate a novel hybrid algorithm for improving classification accuracy on imbalanced datasets.
  • To address the limitations of traditional machine learning algorithms in handling datasets with disparate class distributions.
  • To enhance key performance indicators such as precision, recall, F1-score, and Area Under Curve (AUC).
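As a rough illustration of why these indicators matter more than plain accuracy on imbalanced data, the sketch below computes them for a two-class toy problem. The function names (binary_metrics, auc), the labels, and the predictions are all made up for this example and are not taken from the study; for a multi-class dataset the same quantities would typically be computed per class in a one-vs-rest fashion and then averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs),
            "precision": precision, "recall": recall, "f1": f1}


def auc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy data: 95 majority (0) and 5 minority (1) samples.
y_true = [0] * 95 + [1] * 5
# A classifier that always predicts the majority class.
always_majority = [0] * 100
# A second made-up classifier: 6 false alarms, 4 of the 5 minority samples found.
second_guess = [1] * 6 + [0] * 89 + [1] * 4 + [0]

lazy = binary_metrics(y_true, always_majority)
better = binary_metrics(y_true, second_guess)
print(lazy)
print(better)
```

In this toy setup the always-majority classifier reaches a higher accuracy than the second one while finding none of the minority samples, which is the kind of bias that precision, recall, F1-score, and AUC are meant to expose.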

Main Methods:

  • The proposed algorithm integrates Borderline-Synthetic Minority Oversampling Technique (BLSMOTE) with K-means clustering.
  • BLSMOTE generates synthetic minority-class samples only around minority points that lie near the class boundary, which limits the influence of noisy points and improves minority-class representation.
  • K-means clustering groups data points based on similarity, further aiding in data partitioning and model training.
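The two building blocks named above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it follows the commonly described "borderline-1" variant of Borderline-SMOTE and a basic Lloyd-style K-means, and the summary does not say how the study chains the two steps, so each is simply run on its own. All names (borderline_smote, kmeans), the parameters m, k, n_new, and seed, and the toy points are assumptions made for this example.

```python
import math
import random


def borderline_smote(minority, majority, m=5, k=3, n_new=1, seed=0):
    """Return (borderline minority points, synthetic points).

    A minority point is 'borderline' when at least half, but not all,
    of its m nearest neighbours belong to the majority class.
    Assumes at least two minority points.
    """
    rng = random.Random(seed)
    danger = []
    for i, p in enumerate(minority):
        # Every other training point, flagged True if it is a majority point.
        others = [(q, False) for j, q in enumerate(minority) if j != i]
        others += [(q, True) for q in majority]
        others.sort(key=lambda item: math.dist(p, item[0]))
        n_major = sum(is_major for _, is_major in others[:m])
        if m / 2 <= n_major < m:
            danger.append(i)
    synthetic = []
    for i in danger:
        p = minority[i]
        # k nearest minority neighbours of the borderline point.
        peers = sorted((q for j, q in enumerate(minority) if j != i),
                       key=lambda q: math.dist(p, q))[:k]
        for _ in range(n_new):
            q = rng.choice(peers)
            gap = rng.random()  # random position on the segment from p to q
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return [minority[i] for i in danger], synthetic


def kmeans(points, centroids, n_iter=10):
    """Basic K-means from given starting centroids; returns (centroids, labels)."""
    labels = []
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        centroids = updated
    return centroids, labels


# Hypothetical 2-D toy data: a safe minority cluster plus two boundary points.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
majority = [(5.0, 0.0), (5.0, 1.0), (6.0, 0.0), (6.0, 1.0),
            (5.0, 2.0), (6.0, 2.0), (7.0, 0.0), (7.0, 1.0)]

danger, synthetic = borderline_smote(minority, majority, m=3, k=2, n_new=1)
centroids, labels = kmeans(minority + majority, [(0.0, 0.0), (6.0, 1.0)])
print(danger)
print(synthetic)
print(labels)
```

In this toy run only the two minority points sitting next to the majority class are flagged as borderline and used as seeds for new samples, while K-means groups those same two points with the majority cluster, which gives a feel for why boundary regions are the hard part of an imbalanced problem.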

Main Results:

  • The combined BLSMOTE + K-means + SVM algorithm demonstrated superior classification performance compared to traditional methods on the dry bean and obesity levels datasets.
  • BLSMOTE + K-means + DT successfully generated decision rules for both datasets, offering interpretable insights.
  • BLSMOTE + K-means + RF effectively ranked the importance of explanatory variables, providing valuable information for feature selection.

Conclusions:

  • The proposed BLSMOTE + K-means hybrid approach offers a robust solution for enhancing machine learning classification on imbalanced data.
  • This method improves overall predictive accuracy and provides valuable insights through decision rules and variable importance rankings.
  • The findings offer scientific evidence to support decision-making processes in fields dealing with imbalanced datasets.