Class prediction for high-dimensional class-imbalanced data.

Rok Blagus¹, Lara Lusa

¹Institute for Biostatistics and Medical Informatics, University of Ljubljana, Vrazov trg 2, Ljubljana, Slovenia.

BMC Bioinformatics

October 22, 2010

Summary

This summary is machine-generated.

High-dimensional, class-imbalanced data pose significant challenges for accurate sample classification. Standard classification methods often perform poorly on the minority class, so predictive accuracy must be assessed carefully and the imbalance handled with an appropriate strategy.

Area of Science:

Bioinformatics
Computational Biology
Machine Learning

Background:

Class prediction studies aim to create accurate classification rules for new samples.
High-dimensional data, common in fields like genomics, features more variables than samples.
Class-imbalanced data, where sample counts per class differ, frequently complicates standard classification methods, biasing predictions towards the majority class.
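The bias towards the majority class is easiest to see when accuracy is reported per class rather than overall. The following is a minimal sketch, not taken from the study: the function name `class_specific_accuracy` and the 90/10 test set are hypothetical, and the "classifier" is a degenerate rule that assigns every sample to the majority class.

```python
from collections import Counter

def class_specific_accuracy(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / totals[c] for c in totals}

# Hypothetical test set: 90 majority-class samples (0), 10 minority-class samples (1).
y_true = [0] * 90 + [1] * 10
# Degenerate rule (an assumption for illustration): always predict the majority class.
y_pred = [0] * 100

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_class = class_specific_accuracy(y_true, y_pred)
print(overall)    # 0.9
print(per_class)  # {0: 1.0, 1: 0.0}
```

An overall accuracy of 0.9 hides the fact that no minority-class sample is classified correctly, which is why class-specific accuracies are the more informative summary on imbalanced data.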

Purpose of the Study:

To investigate the challenges high-dimensionality poses for class prediction using imbalanced datasets.
To evaluate the performance of various classifiers on imbalanced, high-dimensional data.
To assess strategies for mitigating the effects of class imbalance in predictive modeling.

Main Methods:

Evaluation of six classifier types on simulated and real-world (breast cancer gene-expression) imbalanced datasets.
Analysis of the impact of variable selection and normalization on classifier performance.
Assessment of over-sampling and down-sizing strategies for addressing class imbalance.
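The two resampling strategies named above can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' code: down-sizing is taken to mean randomly discarding samples until every class matches the smallest class, over-sampling is taken to mean randomly replicating samples until every class matches the largest class, and the helper names (`down_size`, `over_sample`) and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def _indices_by_class(y):
    """Group sample indices by class label."""
    groups = defaultdict(list)
    for i, label in enumerate(y):
        groups[label].append(i)
    return groups

def down_size(X, y, seed=0):
    """Randomly drop samples so every class has as many as the smallest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_min = min(len(idx) for idx in groups.values())
    keep = sorted(i for idx in groups.values() for i in rng.sample(idx, n_min))
    return [X[i] for i in keep], [y[i] for i in keep]

def over_sample(X, y, seed=0):
    """Randomly replicate samples (with replacement) so every class matches the largest class."""
    rng = random.Random(seed)
    groups = _indices_by_class(y)
    n_max = max(len(idx) for idx in groups.values())
    keep = []
    for idx in groups.values():
        keep.extend(idx)                                   # keep every original sample
        keep.extend(rng.choices(idx, k=n_max - len(idx)))  # add random replicates
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: 9 majority-class samples (0) and 3 minority-class samples (1).
X = [[float(i)] for i in range(12)]
y = [0] * 9 + [1] * 3

X_down, y_down = down_size(X, y)
X_over, y_over = over_sample(X, y)
print(Counter(y_down))  # 3 samples per class
print(Counter(y_over))  # 9 samples per class
```

Either function should be applied to the training set only, so that the test set keeps its original class prevalence.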

Main Results:

Classifiers demonstrated high sensitivity to class imbalance, with variable selection exacerbating bias towards the majority class.
Down-sizing and asymmetric bagging were effective for mild imbalance, while over-sampling showed limited benefit.
Variable normalization could negatively impact classifier performance.
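Asymmetric bagging can be sketched as an ensemble in which each member is trained on all minority-class samples plus an equally sized random subset of the majority class, with predictions combined by majority vote. The version below is a hedged illustration for two classes only: the nearest-centroid base learner, the function names, the number of rounds, and the toy data are all assumptions, and no variable selection step is included.

```python
import random
from collections import Counter

def fit_centroids(X, y):
    """Base learner (an assumption): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_centroid(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))

def asymmetric_bagging(X, y, n_rounds=25, seed=0):
    """Fit n_rounds base learners, each on all minority samples plus an
    equally sized random subset of the majority class (two-class case)."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    maj_idx = [i for i, label in enumerate(y) if label != minority]
    models = []
    for _ in range(n_rounds):
        idx = min_idx + rng.sample(maj_idx, len(min_idx))
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, row):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroid(m, row) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical one-variable data: 20 majority samples near 0-2, 4 minority samples near 3.
X = [[0.1 * i] for i in range(20)] + [[3.0 + 0.1 * i] for i in range(4)]
y = [0] * 20 + [1] * 4

models = asymmetric_bagging(X, y)
print(predict_bagged(models, [3.2]))  # 1
print(predict_bagged(models, [0.5]))  # 0
```

Because every base learner sees a balanced training set, no single member is dominated by the majority class, while the repeated subsampling lets the ensemble use more of the majority-class data than a single down-sized training set would.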

Conclusions:

Matching class prevalence between the training and test sets does not guarantee good classifier performance on imbalanced, high-dimensional data.
High-dimensionality amplifies the difficulties associated with class-imbalanced data classification.
Researchers must carefully evaluate predictive accuracy and employ appropriate imbalance handling techniques for reliable classification.