Related Concept Videos

Weighted Mean

When taking the arithmetic, geometric, or harmonic mean of a sample data set, every data point is given equal importance. In some data sets, however, not all values are equally important. An intrinsic bias in the data may make it appropriate to weight specific values more heavily than others.
For example, consider the number of goals scored in the matches of a tournament. While computing the average number of goals scored in the tournament, it may be more important to...
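A weighted mean can be sketched as follows. This is a minimal illustration only: the goal counts and weights are hypothetical and do not come from the source.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: goals in three matches, with the last match counted double.
goals = [2, 0, 4]
weights = [1, 1, 2]
print(weighted_mean(goals, weights))  # (2*1 + 0*1 + 4*2) / 4 = 2.5
```

With equal weights the function reduces to the ordinary arithmetic mean.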

Unusual Results

Unusual results are those that have a very low chance of occurring. They can be identified using probabilities and the range rule of thumb. In problems involving probability, unusual results arise in two ways: an unusually high number of successes or an unusually low number of successes.
According to the range rule of thumb, any value more than two standard deviations (2σ) above or below the mean, μ, is considered unusual.
Maximum unusual value =...
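The range rule of thumb described above can be sketched as follows. This is a minimal illustration, assuming hypothetical values for μ and σ; the function names are invented for the example.

```python
def usual_range(mu, sigma):
    """Return the (low, high) limits mu - 2*sigma and mu + 2*sigma."""
    return mu - 2 * sigma, mu + 2 * sigma


def is_unusual(x, mu, sigma):
    """True if x lies more than two standard deviations from the mean."""
    low, high = usual_range(mu, sigma)
    return x < low or x > high


# Hypothetical population: mean 50, standard deviation 5.
print(usual_range(50, 5))     # (40, 60)
print(is_unusual(62, 50, 5))  # True
print(is_unusual(55, 50, 5))  # False
```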

Quantifying and Rejecting Outliers: The Grubbs Test

Sometimes, a data set can have a recorded numerical observation that greatly deviates from the rest of the data. Assuming that the data is normally distributed, a statistical method called the Grubbs test can be used to determine whether the observation is truly an outlier. To perform a two-tailed Grubbs test, first, calculate the absolute difference between the outlier and the mean. Then, calculate the ratio between this difference and the standard deviation of the sample. This...
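The two steps described above (absolute difference from the mean, divided by the sample standard deviation) can be sketched as follows. The data are hypothetical, and the sketch treats the point farthest from the mean as the suspected outlier. It stops at the test statistic; the comparison against a critical value is not shown.

```python
import statistics


def grubbs_statistic(data):
    """Return (suspect, G), where G = |suspect - mean| / sample standard deviation."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all observations are identical; G is undefined")
    # Assumption for this sketch: the suspect is the point farthest from the mean.
    suspect = max(data, key=lambda x: abs(x - mean))
    return suspect, abs(suspect - mean) / sd


# Hypothetical measurements with one value far from the rest.
suspect, g = grubbs_statistic([10, 10, 10, 10, 20])
print(suspect, round(g, 3))  # 20 1.789
```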

Censoring Survival Data

Survival analysis is a statistical method used to analyze time-to-event data, often employed in fields such as medicine, engineering, and social sciences. One of the key challenges in survival analysis is dealing with incomplete data, a phenomenon known as "censoring." Censoring occurs when the event of interest (such as death, relapse, or system failure) has not occurred for some individuals by the end of the study period or is otherwise unobservable, and it might have many different...

Regression Toward the Mean

Regression toward the mean (“RTM”) is a phenomenon in which extremely high or low values (for example, an individual’s blood pressure at a particular moment) appear closer to a group’s average upon remeasuring. Although this statistical peculiarity is the result of random error and chance, it has been problematic across various medical, scientific, financial, and psychological applications. In particular, RTM, if not taken into account, can interfere when...

Testing a Claim about Population Proportion

A complete procedure for testing a claim about a population proportion is provided here.
There are two methods of testing a claim about a population proportion: (1) using the sample proportion from the data, where the binomial distribution is approximated by the normal distribution, and (2) using the binomial probabilities calculated from the data.
The first method uses the normal distribution as an approximation to the binomial distribution. The requirements are as follows: sample size is large...

Weighting Methods for Rare Event Identification From Imbalanced Datasets.

Jia He¹, Maggie X Cheng¹

¹Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL, United States.

Frontiers in Big Data | January 10, 2022

Summary

This summary is machine-generated.

This study introduces new weighting algorithms to improve rare event detection in machine learning. These methods effectively handle imbalanced datasets, outperforming existing techniques, especially with noisy data.

Keywords:

bias, classification, imbalanced dataset, machine learning, rare event

Related Experiment Videos

Last Updated: Oct 7, 2025

Inverse Probability of Treatment Weighting Propensity Score using the Military Health System Data Repository and National Death Index

Published on: January 8, 2020

Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances

Published on: October 11, 2018

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

Published on: January 11, 2020

Area of Science:

Machine Learning
Data Science
Artificial Intelligence

Background:

Machine learning models struggle with imbalanced datasets where rare events are crucial.
Classifiers are often biased towards majority classes, hindering accurate rare event detection.
Existing methods for imbalanced data, like sampling and basic weighting, have limitations.
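The bias toward the majority class can be illustrated with a small sketch. The data are hypothetical, and the "classifier" is a deliberately trivial one that always predicts the majority class. It scores high on accuracy while detecting no rare events at all.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall on the positive class)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Hypothetical data: 990 normal samples (0) and 10 rare events (1).
y_true = [0] * 990 + [1] * 10
y_pred = [0] * 1000  # always predict the majority class

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(accuracy, recall)  # 0.99 0.0
```

This is why accuracy alone is a poor yardstick on imbalanced data, and why detection rate on the rare class is usually reported alongside it.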

Purpose of the Study:

To develop novel weighting algorithms for effective rare event detection in machine learning.
To address the challenge of identifying infrequent but important events in large datasets.
To improve the performance of classifiers on imbalanced data, particularly in real-time applications.

Main Methods:

Proposed a boosting-style algorithm for computing class weights with theoretical guarantees.
Developed an adaptive weighting algorithm suitable for real-time network monitoring and similar applications.
Focused on the weighting approach to rebalance class importance without discarding data.
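The general idea of rebalancing class importance through weights, without discarding data, can be sketched as follows. This is not the boosting-style or adaptive algorithm proposed in the study, whose details are not given here. It uses a common inverse-frequency heuristic (n_samples / (n_classes × class count)) and a class-weighted log loss, with hypothetical data and function names.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count of the class)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def weighted_log_loss(y_true, p_pos, weights, eps=1e-12):
    """Class-weighted binary cross-entropy; p_pos holds predicted P(y = 1)."""
    total, norm = 0.0, 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # keep log() finite
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1 - p))
        norm += w
    return total / norm


# Hypothetical labels: 90 majority samples (0) and 10 rare events (1).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights[1], round(weights[0], 3))  # 5.0 0.556

# A model that outputs P(y = 1) = 0.1 for everything is penalised far more
# once the rare class is up-weighted.
p_const = [0.1] * len(labels)
unweighted = weighted_log_loss(labels, p_const, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(labels, p_const, weights)
print(weighted > unweighted)  # True
```

Every sample stays in the training set; only its contribution to the loss changes, which is the property that distinguishes weighting from undersampling.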

Main Results:

The proposed algorithms demonstrated superior performance compared to existing weighting and boosting methods.
Effectiveness was validated on power grid data and various public imbalanced datasets.
The performance advantage was more pronounced on noisy data, improving rare event identification.

Conclusions:

The novel weighting algorithms offer a robust solution for rare event detection in imbalanced datasets.
The adaptive nature allows for controlled trade-offs between detection rates and false alarms.
These methods provide a significant advancement for applications like network monitoring and anomaly detection.
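The trade-off between detection rate and false alarms can be illustrated with a simple cost-weighted decision rule. This is only a sketch under stated assumptions, not the adaptive algorithm from the study: it assumes a classifier that outputs probabilities, uses hypothetical scores, and flags a sample when w_pos · p > w_neg · (1 − p).

```python
def rates_for_weight(y_true, p_pos, w_pos, w_neg=1.0):
    """Return (detection rate, false alarm rate) for a cost-weighted rule.

    Flagging when w_pos * p > w_neg * (1 - p) is equivalent to thresholding
    p at w_neg / (w_pos + w_neg). Assumes both classes are present in y_true.
    """
    threshold = w_neg / (w_pos + w_neg)
    flagged = [p > threshold for p in p_pos]
    tp = sum(f and y == 1 for f, y in zip(flagged, y_true))
    fp = sum(f and y == 0 for f, y in zip(flagged, y_true))
    n_pos = sum(y == 1 for y in y_true)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, fp / n_neg


# Hypothetical predicted probabilities for six normal samples and two rare events.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
p_pos = [0.05, 0.08, 0.20, 0.30, 0.40, 0.60, 0.35, 0.80]

for w_pos in (1.0, 3.0, 9.0):
    detection, false_alarm = rates_for_weight(y_true, p_pos, w_pos)
    print(w_pos, detection, round(false_alarm, 3))
```

In this toy example, raising the weight on the rare class from 1 to 3 lifts the detection rate from 0.5 to 1.0, and the false alarm rate rises with it. Raising it further to 9 only adds false alarms.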