



Weighting Methods for Rare Event Identification From Imbalanced Datasets.

Jia He¹, Maggie X Cheng¹

  ¹ Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL, United States.

Frontiers in Big Data | January 10, 2022
Summary
This summary is machine-generated.

This study introduces new weighting algorithms to improve rare event detection in machine learning. These methods effectively handle imbalanced datasets, outperforming existing techniques, especially with noisy data.

Keywords:
bias, classification, imbalanced dataset, machine learning, rare event


Area of Science:

  • Machine Learning
  • Data Science
  • Artificial Intelligence

Background:

  • Machine learning models struggle with imbalanced datasets where rare events are crucial.
  • Classifiers are often biased towards majority classes, hindering accurate rare event detection.
  • Existing methods for imbalanced data, like sampling and basic weighting, have limitations.
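
The majority-class bias described above can be illustrated with a minimal sketch. The labels below are hypothetical toy data (990 normal samples, 10 rare events), not data from the study, and the "classifier" is a deliberately degenerate one that always predicts the majority class. It shows why plain accuracy can look excellent while every rare event is missed.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true rare events (label == positive) that were detected."""
    preds_on_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_on_positives:
        return 0.0
    return sum(p == positive for p in preds_on_positives) / len(preds_on_positives)


# Hypothetical toy labels: 990 normal samples (0) and 10 rare events (1).
y_true = [0] * 990 + [1] * 10

# A degenerate "classifier" that always predicts the majority class.
majority_label = Counter(y_true).most_common(1)[0][0]
y_pred = [majority_label] * len(y_true)

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.3f}, rare-event recall = {rec:.3f}")
```

Here the accuracy is 0.990 while the rare-event recall is 0.000, which is why metrics and training objectives that treat all samples equally are a poor fit for rare event detection.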

Purpose of the Study:

  • To develop novel weighting algorithms for effective rare event detection in machine learning.
  • To address the challenge of identifying infrequent but important events in large datasets.
  • To improve the performance of classifiers on imbalanced data, particularly in real-time applications.

Main Methods:

  • Proposed a boosting-style algorithm for computing class weights with theoretical guarantees.
  • Developed an adaptive weighting algorithm suitable for real-time network monitoring and similar applications.
  • Focused on the weighting approach to rebalance class importance without discarding data.
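
The general idea of rebalancing class importance by weighting, without discarding data, can be sketched as follows. This is not the boosting-style or adaptive algorithm proposed in the study, whose details are not given in this summary. It is the common inverse-frequency baseline, assumed here purely for illustration, and the function names `inverse_frequency_weights` and `weighted_log_loss` are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Baseline class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def weighted_log_loss(y_true, p_pos, class_weights, eps=1e-12):
    """Class-weighted binary cross-entropy; p_pos[i] is P(label_i == 1)."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        w = class_weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: 95 normal samples (0) and 5 rare events (1).
y_true = [0] * 95 + [1] * 5
weights = inverse_frequency_weights(y_true)

# A model that assigns every sample a low rare-event probability.
p_pos = [0.05] * len(y_true)

unweighted = weighted_log_loss(y_true, p_pos, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(y_true, p_pos, weights)
print(f"weights = {weights}")
print(f"unweighted loss = {unweighted:.4f}, weighted loss = {weighted:.4f}")
```

With these weights each class contributes equally to the total, so a model that ignores the rare class is penalized far more heavily under the weighted loss than under the unweighted one. All samples stay in the training set, in contrast to undersampling.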

Main Results:

  • The proposed algorithms demonstrated superior performance compared to existing weighting and boosting methods.
  • Effectiveness was validated on power grid data and various public imbalanced datasets.
  • The advantage over existing methods was more pronounced on noisy data, improving rare event identification.

Conclusions:

  • The novel weighting algorithms offer a robust solution for rare event detection in imbalanced datasets.
  • The adaptive weighting algorithm allows controlled trade-offs between detection rate and false-alarm rate.
  • These methods provide a significant advancement for applications like network monitoring and anomaly detection.
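
The trade-off between detection rate and false alarms can be made concrete with a small cost-sensitive sketch. It is an illustration under stated assumptions, not the authors' adaptive algorithm: the scores are hypothetical, the majority class is given unit weight, and the rare class is given weight `w_pos`, so that flagging a sample is cost-optimal when `w_pos * p > 1 - p`, i.e. when `p > 1 / (1 + w_pos)`.

```python
def rates_for_weight(y_true, p_pos, w_pos):
    """Return (detection rate, false-alarm rate) for rare-class weight w_pos.

    A sample is flagged when its rare-event probability p satisfies
    p > 1 / (1 + w_pos), the cost-optimal threshold under unit weight
    on the majority class.
    """
    threshold = 1.0 / (1.0 + w_pos)
    flagged = [p > threshold for p in p_pos]
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    true_pos = sum(f for f, y in zip(flagged, y_true) if y == 1)
    false_pos = sum(f for f, y in zip(flagged, y_true) if y == 0)
    return true_pos / n_pos, false_pos / n_neg


# Hypothetical labels and predicted rare-event probabilities.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
p_pos = [0.02, 0.05, 0.08, 0.15, 0.20, 0.30, 0.40, 0.60, 0.35, 0.80]

results = {w: rates_for_weight(y_true, p_pos, w) for w in (1, 3, 9)}
for w, (detection, false_alarm) in results.items():
    print(f"w_pos={w}: detection rate={detection:.3f}, false-alarm rate={false_alarm:.3f}")
```

On this toy data, raising the rare-class weight from 1 to 3 lifts the detection rate from 0.5 to 1.0 while the false-alarm rate rises from 0.125 to 0.375. Raising it further to 9 adds only false alarms (0.625), which is the kind of trade-off a weighting scheme has to control.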