Class prediction for high-dimensional class-imbalanced data.

Rok Blagus¹, Lara Lusa

  • ¹Institute for Biostatistics and Medical Informatics, University of Ljubljana, Vrazov trg 2, Ljubljana, Slovenia.

BMC Bioinformatics | October 22, 2010
Summary
This summary is machine-generated.

High-dimensional, class-imbalanced data make accurate sample classification difficult. Standard methods often perform poorly, especially on the minority class, so predictive accuracy must be assessed carefully and the imbalance handled with an appropriate strategy.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Class prediction studies aim to create accurate classification rules for new samples.
  • High-dimensional data, common in fields like genomics, features more variables than samples.
  • Class-imbalanced data, where sample counts per class differ, frequently complicates standard classification methods, biasing predictions towards the majority class.
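The bias towards the majority class is easy to miss when only overall accuracy is reported. The sketch below is an illustration, not taken from the study: it uses a hypothetical test set of 90 majority-class and 10 minority-class samples and a degenerate rule that assigns every sample to the majority class. The function names `overall_accuracy` and `class_specific_accuracy` are assumptions introduced here.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def class_specific_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples within each true class."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Counter returns 0 for classes that were never predicted correctly.
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical test set: 90 majority-class (0) and 10 minority-class (1) samples.
y_true = [0] * 90 + [1] * 10
# Degenerate rule: assign every sample to the majority class.
y_pred = [0] * 100

print(overall_accuracy(y_true, y_pred))         # 0.9
print(class_specific_accuracy(y_true, y_pred))  # {0: 1.0, 1: 0.0}
```

The overall accuracy looks acceptable even though no minority-class sample is classified correctly, which is why class-specific accuracies are worth reporting alongside it.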

Purpose of the Study:

  • To investigate the challenges high-dimensionality poses for class prediction using imbalanced datasets.
  • To evaluate the performance of various classifiers on imbalanced, high-dimensional data.
  • To assess strategies for mitigating the effects of class imbalance in predictive modeling.

Main Methods:

  • Evaluation of six classifier types on simulated and real-world (breast cancer gene-expression) imbalanced datasets.
  • Analysis of the impact of variable selection and normalization on classifier performance.
  • Assessment of over-sampling and down-sizing strategies for addressing class imbalance.
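The two resampling strategies named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: down-sizing is taken to mean randomly discarding samples from larger classes until every class matches the smallest one, and over-sampling is taken to mean replicating randomly chosen samples from smaller classes until every class matches the largest one. The helper names (`downsize`, `oversample`, `_group_by_class`) and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def _group_by_class(samples, labels):
    """Collect samples into one list per class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def downsize(samples, labels, seed=0):
    """Randomly drop samples so every class has as many as the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    n_min = min(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        out_x.extend(rng.sample(group, n_min))  # sampling without replacement
        out_y.extend([label] * n_min)
    return out_x, out_y


def oversample(samples, labels, seed=0):
    """Replicate random samples so every class has as many as the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(samples, labels)
    n_max = max(len(group) for group in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(n_max - len(group))]
        out_x.extend(group + extra)  # originals plus exact replicates
        out_y.extend([label] * n_max)
    return out_x, out_y


# Toy training set (assumed for illustration): 9 majority vs 3 minority samples.
samples = [[float(i), float(i % 3)] for i in range(12)]
labels = [0] * 9 + [1] * 3

down_x, down_y = downsize(samples, labels)
over_x, over_y = oversample(samples, labels)
print(len(down_y), len(over_y))  # 6 18
```

Either function would normally be applied to the training set only, leaving the test set untouched so that the reported accuracy reflects the original class distribution.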

Main Results:

  • Classifiers demonstrated high sensitivity to class imbalance, with variable selection exacerbating bias towards the majority class.
  • Down-sizing and asymmetric bagging were effective for mild imbalance, while over-sampling showed limited benefit.
  • Variable normalization could negatively impact classifier performance.
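Asymmetric bagging, as the technique is commonly described, builds each training bag from all minority-class samples plus an equally sized random subset of the majority class, then combines the per-bag classifiers. The sketch below illustrates that idea only. The nearest-centroid base classifier, the majority-vote aggregation, the number of bags, and all function names are assumptions made for the example, and the variable-selection step is omitted; none of these details are confirmed by the summary.

```python
import random


def nearest_centroid_fit(samples, labels):
    """Toy base classifier: store one mean vector (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def asymmetric_bagging_fit(samples, labels, minority, n_bags=11, seed=0):
    """Train one base classifier per bag; each bag is balanced by construction."""
    rng = random.Random(seed)
    minority_x = [x for x, y in zip(samples, labels) if y == minority]
    majority = [(x, y) for x, y in zip(samples, labels) if y != minority]
    models = []
    for _ in range(n_bags):
        # Keep every minority sample; draw an equal number of majority samples.
        drawn = rng.sample(majority, len(minority_x))
        bag_x = minority_x + [x for x, _ in drawn]
        bag_y = [minority] * len(minority_x) + [y for _, y in drawn]
        models.append(nearest_centroid_fit(bag_x, bag_y))
    return models


def asymmetric_bagging_predict(models, x):
    """Majority vote over the per-bag predictions (odd n_bags avoids ties)."""
    votes = [nearest_centroid_predict(m, x) for m in models]
    return max(set(votes), key=votes.count)


# Toy data (assumed): 20 majority samples near the origin, 4 minority near (5, 5).
samples = [[i * 0.1, -i * 0.1] for i in range(20)] + [[5.0 + j * 0.1, 5.0] for j in range(4)]
labels = [0] * 20 + [1] * 4

models = asymmetric_bagging_fit(samples, labels, minority=1)
print(asymmetric_bagging_predict(models, [5.2, 4.9]))   # 1
print(asymmetric_bagging_predict(models, [0.5, -0.3]))  # 0
```

Because every bag is balanced, no single base classifier sees the original class ratio, while the ensemble as a whole still draws on many different majority-class samples.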

Conclusions:

  • Matching the class prevalence of the training and test sets does not guarantee good classifier performance on imbalanced, high-dimensional data.
  • High-dimensionality amplifies the difficulties associated with class-imbalanced data classification.
  • Researchers must carefully evaluate predictive accuracy and employ appropriate imbalance handling techniques for reliable classification.