


Improving Gaussian Naive Bayes classification on imbalanced data through coordinate-based minority feature mining.

Wei Wang¹, Li Yan¹, Fen Liu¹

  • ¹School of Business, Guilin Tourism University, Guilin, Guangxi, China.

PeerJ Computer Science | September 24, 2025


Summary
This summary is machine-generated.

This study introduces a novel coordinate transformation algorithm to improve Gaussian Naive Bayes (GNB) classification performance on imbalanced data. The radial local relative density changes (RLDC) method enhances minority class representation without altering data distribution, outperforming traditional sampling techniques.

Keywords:

  • Coordinate transformation
  • Gaussian Naive Bayes classifier
  • Imbalanced data
  • Minority class salient features


Area of Science:

  • Machine Learning
  • Data Science
  • Artificial Intelligence

Background:

  • Gaussian Naive Bayes (GNB) classifiers struggle with imbalanced datasets, leading to performance degradation.
  • Existing sampling techniques for imbalanced data alter data distribution and can cause overfitting or class overlap.
  • There is a need for methods that improve GNB performance on imbalanced data without modifying the original dataset.
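
To make the background concrete, the sketch below fits a minimal Gaussian Naive Bayes model to a synthetic one-dimensional dataset with a 95:5 class ratio. The data, the class ratio, and the helper names (fit_gnb, log_posterior, predict) are assumptions chosen for illustration and do not come from the study. The example only shows how a small class prior can push the decision boundary toward the minority class and lower its recall.

```python
import math
import random

def fit_gnb(X, y):
    """Estimate the prior, per-feature means, and per-feature variances of each class."""
    model = {}
    n = len(y)
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        d = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-9
            for j in range(d)
        ]
        model[c] = (len(rows) / n, means, variances)
    return model

def log_posterior(model, x, c):
    """Unnormalized log posterior: log prior plus the sum of Gaussian log likelihoods."""
    prior, means, variances = model[c]
    total = math.log(prior)
    for xj, m, v in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
    return total

def predict(model, x):
    """Return the class with the largest log posterior."""
    return max(model, key=lambda c: log_posterior(model, x, c))

# Synthetic, illustrative data: 950 majority samples and 50 minority samples.
rng = random.Random(0)
majority = [[rng.gauss(0.0, 1.0)] for _ in range(950)]
minority = [[rng.gauss(2.0, 1.0)] for _ in range(50)]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

model = fit_gnb(X, y)
majority_recall = sum(predict(model, x) == 0 for x in majority) / len(majority)
minority_recall = sum(predict(model, x) == 1 for x in minority) / len(minority)
print(f"majority recall: {majority_recall:.2f}, minority recall: {minority_recall:.2f}")
```

Because the minority prior is only 0.05, a sample has to lie well inside the minority region before its likelihood outweighs the prior, which is one plausible reading of the performance degradation described above.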

Purpose of the Study:

  • To propose a novel coordinate transformation algorithm based on radial local relative density changes (RLDC).
  • To enhance GNB classification performance on imbalanced datasets by generating new features.
  • To preserve the original data's quantity and distribution while improving minority class representation.

Main Methods:

  • Developed a coordinate transformation algorithm that converts absolute coordinates to RLDC-relative coordinates.
  • The RLDC transformation reveals latent local relative density change features, highlighting minority class patterns.
  • Applied the transformed features to the GNB classifier to improve minority class probability estimation.
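
The summary does not specify how RLDC coordinates are computed, so the following is not the authors' algorithm. It is a hedged stand-in that illustrates the general idea of deriving a density-based feature while leaving the original samples untouched: each point gets one extra column holding its local density relative to the dataset average. The function names, the k-nearest-neighbor density estimate, and the toy points are all assumptions.

```python
import math

def knn_mean_distance(points, i, k):
    """Mean Euclidean distance from points[i] to its k nearest neighbors."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return sum(dists[:k]) / k

def add_relative_density_feature(points, k=3):
    """Return copies of the points with one appended column:
    local density divided by the mean density over all points.
    Hypothetical helper, not the RLDC transformation itself."""
    density = [
        1.0 / (knn_mean_distance(points, i, k) + 1e-12)
        for i in range(len(points))
    ]
    mean_density = sum(density) / len(density)
    return [list(p) + [d / mean_density] for p, d in zip(points, density)]

# Toy data: four tightly clustered points and one isolated point.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
augmented = add_relative_density_feature(points, k=3)

for row in augmented:
    print([round(v, 3) for v in row])
```

The number of samples and their original coordinates are unchanged; only a derived feature is added, which mirrors the stated goal of not altering the data's quantity or distribution. A classifier such as GNB could then be trained on the augmented rows.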

Main Results:

  • The RLDC-based coordinate transformation algorithm significantly improved GNB performance on 20 imbalanced datasets.
  • The algorithm outperformed 14 traditional sampling algorithms across three classification evaluation metrics.
  • Achieved average performance improvements of 21.84%, 33.45%, and 54.63% compared to existing methods.

Conclusions:

  • The RLDC coordinate transformation offers a novel and effective approach to handling imbalanced data in GNB classification.
  • This method enhances classification accuracy by creating informative features without altering the original data.
  • The algorithm demonstrates significant theoretical and practical value for imbalanced classification problems.
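
The summary reports results on three classification evaluation metrics without naming them. As an illustration only, the sketch below computes three metrics that are commonly used for imbalanced binary classification (recall, F1, and G-mean) alongside plain accuracy; choosing these three is an assumption, and the labels and predictions are made up.

```python
import math

def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, minority-class recall, F1, and G-mean from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # G-mean is the geometric mean of recall on the two classes.
    g_mean = math.sqrt(recall * specificity)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "recall": recall, "f1": f1, "g_mean": g_mean}

# Made-up example: 4 minority (1) and 6 majority (0) samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
scores = binary_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

In this toy case accuracy is 0.7 even though only half of the minority samples are recovered, which is why minority-sensitive metrics are usually preferred when classes are imbalanced.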