



Binary Classification with Imbalanced Data.

Jyun-You Chiang¹, Yuhlong Lio², Chien-Ya Hsu³

  • ¹ School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China.

Entropy (Basel, Switzerland) | January 22, 2024
PubMed
Summary
This summary is machine-generated.

This study introduces an expectation-maximization (EM) algorithm for fitting zero-inflated Bernoulli (ZIBer) models to imbalanced data. In Monte Carlo comparisons on such datasets, the ZIBer model and LightGBM were more competitive predictors than artificial neural networks (ANNs).

Keywords:
Entropy, artificial neural network, expectation-maximization algorithm, logistic regression, zero-inflated model


Area of Science:

  • Statistics
  • Machine Learning
  • Computational Statistics

Background:

  • Imbalanced data, characterized by an excess of zero counts in the response variable, pose significant challenges for binary classification tasks.
  • Existing methods struggle with accurate parameter estimation and prediction when dealing with zero-inflated and imbalanced datasets.

Purpose of the Study:

  • To propose an expectation-maximization (EM) algorithm for simplifying the computation of maximum likelihood estimators (MLEs) for zero-inflated Bernoulli (ZIBer) model parameters with imbalanced data.
  • To compare, using Monte Carlo simulations, the predictive performance of the ZIBer model with that of two widely used machine learning methods: LightGBM and artificial neural networks (ANNs).
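
For orientation, one common way to write a zero-inflated Bernoulli model with a logistic link is sketched below. The symbols are assumptions introduced here for illustration (δ for the probability of a structural zero, β for the regression coefficients, x_i for the covariate vector of observation i); the summary does not give the paper's notation, and the paper's parameterization may differ, for example by letting δ depend on covariates as well.

```latex
\begin{aligned}
P(Y_i = 1 \mid x_i) &= (1-\delta)\,p_i, \\
P(Y_i = 0 \mid x_i) &= \delta + (1-\delta)(1-p_i), \\
\operatorname{logit}(p_i) &= x_i^{\top}\beta, \\
\ell(\beta,\delta) &= \sum_{i=1}^{n}\Big[\, y_i \log\{(1-\delta)\,p_i\}
  + (1-y_i)\log\{\delta + (1-\delta)(1-p_i)\} \Big].
\end{aligned}
```

Under this form the maximum likelihood estimators are the values of β and δ that maximize ℓ. The mixture term inside the second logarithm is what makes direct maximization awkward and motivates an EM approach.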

Main Methods:

  • Development of an expectation-maximization (EM) algorithm to efficiently derive MLEs for ZIBer model parameters.
  • Implementation of a logistic regression model to link Bernoulli probabilities with covariates within the ZIBer framework.
  • Comparative analysis using Monte Carlo simulations to evaluate prediction performance across ZIBer, LightGBM, and ANN models.
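
The EM idea in the methods above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple ZIBer form in which each observation is a structural zero with a constant probability `delta`, and otherwise follows a Bernoulli distribution with success probability given by a logistic regression on the covariates. The function names (`fit_ziber_em`, `weighted_logistic`, `simulate`), the starting values, and the stopping rule are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def prob(beta, xi):
    # Bernoulli success probability under the logistic link.
    return sigmoid(sum(b * v for b, v in zip(beta, xi)))


def solve(a, b):
    # Solve the linear system a x = b by Gauss-Jordan elimination with pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def log_lik(X, y, beta, delta):
    # Observed-data log-likelihood of the assumed ZIBer model.
    total = 0.0
    for xi, yi in zip(X, y):
        p = prob(beta, xi)
        v = (1 - delta) * p if yi == 1 else delta + (1 - delta) * (1 - p)
        total += math.log(max(v, 1e-300))
    return total


def weighted_logistic(X, y, w, beta, steps=25):
    # M-step for beta: Newton-Raphson on the weighted Bernoulli log-likelihood.
    k = len(beta)
    for _ in range(steps):
        grad = [0.0] * k
        # A tiny ridge keeps the Hessian invertible.
        hess = [[1e-8 if i == j else 0.0 for j in range(k)] for i in range(k)]
        for xi, yi, wi in zip(X, y, w):
            p = prob(beta, xi)
            for i in range(k):
                grad[i] += wi * (yi - p) * xi[i]
                for j in range(k):
                    hess[i][j] += wi * p * (1 - p) * xi[i] * xi[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    return beta


def fit_ziber_em(X, y, iters=100, tol=1e-8):
    beta, delta = [0.0] * len(X[0]), 0.5  # arbitrary starting values
    history = [log_lik(X, y, beta, delta)]
    for _ in range(iters):
        # E-step: w_i is the posterior probability that observation i is NOT
        # a structural zero. An observed 1 can never be a structural zero.
        w = []
        for xi, yi in zip(X, y):
            if yi == 1:
                w.append(1.0)
            else:
                p = prob(beta, xi)
                w.append(1.0 - delta / (delta + (1 - delta) * (1 - p)))
        # M-step: closed form for delta, weighted logistic regression for beta.
        delta = 1.0 - sum(w) / len(w)
        beta = weighted_logistic(X, y, w, beta)
        history.append(log_lik(X, y, beta, delta))
        if abs(history[-1] - history[-2]) < tol:
            break
    return beta, delta, history


def simulate(n, beta, delta, rng):
    # Draw data from the assumed model: intercept plus one Gaussian covariate.
    X, y = [], []
    for _ in range(n):
        xi = [1.0, rng.gauss(0.0, 1.0)]
        structural = rng.random() < delta
        y.append(0 if structural else int(rng.random() < prob(beta, xi)))
        X.append(xi)
    return X, y
```

A call such as `fit_ziber_em(*simulate(500, [-0.5, 1.5], 0.3, random.Random(0)))` returns the fitted coefficients, the fitted zero-inflation probability, and the log-likelihood trace. With a constant `delta` the model is only weakly identified, so estimates from small samples can be far from the generating values. The sketch is meant to show the E-step and M-step structure, not to reproduce the study's results.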

Main Results:

  • No single method demonstrated consistent dominance across all scenarios for predictive performance on imbalanced data.
  • The zero-inflated Bernoulli (ZIBer) model and LightGBM were more competitive predictors than the artificial neural network (ANN) model.
  • The proposed EM algorithm effectively simplifies parameter estimation for ZIBer models with imbalanced data.

Conclusions:

  • For zero-inflated imbalanced datasets, the ZIBer model and LightGBM offer robust predictive performance, outperforming ANNs in certain contexts.
  • The choice of model for imbalanced binary classification should consider the specific characteristics of the data, as no universal best method exists.
  • The developed EM algorithm provides an efficient computational approach for parameter estimation in ZIBer models, particularly beneficial for imbalanced data scenarios.