Efficient posterior sampling for high-dimensional imbalanced logistic regression.

Deborshee Sen¹, Matthias Sachs², Jianfeng Lu²

  • ¹ Department of Statistical Science, Duke University, Box 90251, Durham, North Carolina 27708, U.S.A.

Biometrika | January 19, 2021
Summary
This summary is machine-generated.

This study introduces improved Bayesian classification methods for high-dimensional, imbalanced data. New algorithms enhance computational efficiency and accuracy, outperforming existing techniques in simulations and cancer data analysis.

Keywords:
Imbalanced data; Logistic regression; Piecewise-deterministic Markov process; Scalable inference; Subsampling

Area of Science:

  • Statistics
  • Machine Learning
  • Computational Biology

Background:

  • High-dimensional data classification is crucial but challenging, especially with imbalanced datasets.
  • Current Bayesian classification methods using Markov chain Monte Carlo (MCMC) are computationally inefficient for large datasets due to slow mixing rates and high computational cost per step.
  • Standard subsampling techniques, which are used to reduce the per-step cost, break down when the data are imbalanced.
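
One intuition for the last point, offered as a sketch and not necessarily the paper's own argument: when one class is rare, a small uniformly drawn mini-batch often contains no minority-class observations at all, so it carries little information about that class. The helper below computes that probability; `prob_no_minority` is an illustrative name, and sampling with replacement is an assumption made for simplicity.

```python
def prob_no_minority(n_total, n_minority, batch_size):
    """Probability that a uniform mini-batch, drawn with replacement,
    contains no minority-class observation (illustrative assumption)."""
    if n_total <= 0 or not 0 <= n_minority <= n_total:
        raise ValueError("need n_total > 0 and 0 <= n_minority <= n_total")
    # Each draw misses the minority class with probability 1 - n_minority / n_total.
    return (1.0 - n_minority / n_total) ** batch_size


# Hypothetical data set: 100 minority cases among 100,000 observations.
p_miss = prob_no_minority(100_000, 100, 100)
print(f"P(no minority case in a batch of 100) = {p_miss:.3f}")
```

In this hypothetical setting a batch of 100 misses the minority class about 90% of the time, which suggests why uniform subsampling can be a poor fit for imbalanced data.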

Purpose of the Study:

  • To develop efficient Bayesian classification algorithms for high-dimensional and imbalanced data.
  • To overcome the computational limitations of traditional MCMC methods in large-scale Bayesian classification.
  • To address the breakdown of standard subsampling in imbalanced data scenarios.

Main Methods:

  • Generalization of piecewise-deterministic Markov chain Monte Carlo (PD-MCMC) algorithms.
  • Incorporation of importance-weighted and mini-batch subsampling strategies.
  • Theoretical analysis and validation through simulated data and a real-world cancer dataset.
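
The importance-weighted subsampling idea can be illustrated with a gradient estimator for logistic regression. This is a minimal sketch under stated assumptions, not the paper's algorithm: the function names are hypothetical, labels are coded 0/1, and the sampling probabilities are taken proportional to each row's Euclidean norm purely for illustration. Drawing index i with probability p_i and reweighting its contribution by 1/(m p_i), where m is the batch size, keeps the estimate unbiased for the full-data gradient whenever every p_i is strictly positive.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def point_gradient(theta, x, y):
    # Gradient of the negative log-likelihood of one observation, y in {0, 1}.
    residual = sigmoid(sum(t * xi for t, xi in zip(theta, x))) - y
    return [residual * xi for xi in x]


def full_gradient(theta, X, Y):
    # Sum of the per-observation gradients over the whole data set.
    grad = [0.0] * len(theta)
    for x, y in zip(X, Y):
        for k, g in enumerate(point_gradient(theta, x, y)):
            grad[k] += g
    return grad


def importance_probs(X, eps=1e-12):
    # Assumed weighting: probability proportional to the row norm.
    # eps keeps every probability strictly positive.
    norms = [math.sqrt(sum(xi * xi for xi in x)) + eps for x in X]
    total = sum(norms)
    return [n / total for n in norms]


def subsampled_gradient(theta, X, Y, probs, batch_size, rng):
    # Draw indices with replacement according to probs and reweight
    # each sampled gradient by 1 / (batch_size * probs[i]).
    idx = rng.choices(range(len(X)), weights=probs, k=batch_size)
    grad = [0.0] * len(theta)
    for i in idx:
        g = point_gradient(theta, X[i], Y[i])
        for k in range(len(theta)):
            grad[k] += g[k] / (batch_size * probs[i])
    return grad


# Toy imbalanced data: an intercept column plus one feature, one positive case.
X = [[1.0, 0.0], [1.0, 2.0], [1.0, -1.0], [1.0, 5.0]]
Y = [0, 0, 0, 1]
theta = [0.1, -0.2]
probs = importance_probs(X)
print("full gradient:     ", full_gradient(theta, X, Y))
print("subsampled (m = 2):", subsampled_gradient(theta, X, Y, probs, 2, random.Random(0)))
```

Rows with larger norms, which can contribute larger gradient terms, are sampled more often and down-weighted accordingly. How the paper's samplers use such weights inside PD-MCMC is not shown here.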

Main Results:

  • The proposed generalized PD-MCMC algorithms maintain correct stationary distributions even with small subsamples.
  • These novel methods demonstrate substantial performance gains over existing competitors.
  • The approach shows effectiveness in both simulated scenarios and practical application to cancer data.

Conclusions:

  • The developed importance-weighted and mini-batch subsampling for PD-MCMC offers a robust solution for Bayesian classification with high-dimensional, imbalanced data.
  • This approach significantly improves computational efficiency and classification accuracy.
  • The methods provide a valuable tool for analyzing complex biological datasets, such as cancer data.