Constructing a binary prediction model with incomplete data: Variable selection to balance fairness and precision.

He Ren¹, Chun Wang¹, Gongjun Xu²

  • ¹ Measurement and Statistics Program, College of Education, University of Washington.

Psychological Methods | August 14, 2025
Summary
This summary is machine-generated.

This study compares two variable selection methods, bootstrap imputation-stability selection (BI-SS) and stacked elastic net (SENET), for incomplete data. BI-SS is recommended for nested data designs, offering better performance in complex models.

Area of Science:

  • Psychology
  • Statistics
  • Data Science

Background:

  • The tension between explanation and prediction in psychology necessitates robust variable selection.
  • Missing data complicate standard variable selection and penalized regression methods.
  • Evaluating model performance requires considering both prediction accuracy and fairness, especially for societal implications.
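To make the two evaluation dimensions concrete, here is a minimal sketch in plain Python of two accuracy metrics (AUC and F1 score) and one fairness criterion. The fairness measure shown, the gap in true positive rate between groups (often called the equal opportunity difference), is an assumption chosen for illustration; the summary does not say which fairness criteria the study used. The function names and toy data are hypothetical.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_gap(y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[str]) -> float:
    """Largest difference in true positive rate across groups (0 means equal rates)."""
    rates = []
    for g in sorted(set(group)):
        preds = [p for t, p, gg in zip(y_true, y_pred, group) if gg == g and t == 1]
        if preds:  # skip groups with no positive cases
            rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)


# Hypothetical toy data: outcomes, model scores, and a two-level group label.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.7, 0.35, 0.2, 0.8, 0.6, 0.3, 0.1]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f1_score(y_true, y_pred))        # 0.75
print(auc(y_true, scores))             # 0.875
print(tpr_gap(y_true, y_pred, group))  # 0.5
```

In this toy case the model looks reasonable on accuracy (AUC 0.875, F1 0.75) while detecting every positive case in group "a" but only half of those in group "b", which is the kind of trade-off a fairness metric is meant to expose.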

Purpose of the Study:

  • To explore and compare two methods for variable selection with incomplete data: bootstrap imputation-stability selection (BI-SS) and stacked elastic net (SENET).
  • To evaluate the performance of BI-SS and SENET using prediction accuracy and fairness metrics across various simulation complexities.
  • To assess the suitability of these methods for generalized linear models and nested data designs.

Main Methods:

  • Employed bootstrap imputation-stability selection (BI-SS) on multiply imputed datasets, aggregating results via stability selection.
  • Utilized stacked elastic net (SENET) by stacking imputed datasets for a single pooled model fit.
  • Conducted three simulation studies with increasing complexity, evaluating performance with metrics like AUC, F1 score, and fairness criteria.
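The structural difference between the two approaches can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `mean_impute` is a toy stand-in for a real imputation model, `toy_selector` is a stand-in for a penalized regression fit, the 1/M row weights in the stacked version are an assumed convention, and the 0.6 frequency threshold is arbitrary. All names are hypothetical.

```python
import random
from typing import Callable, List, Optional, Tuple

Row = Tuple[List[Optional[float]], int]  # (predictors with None = missing, 0/1 outcome)
Selector = Callable[[List[Row], List[float]], List[int]]  # returns selected column indices


def mean_impute(data: List[Row]) -> List[Row]:
    """Toy imputation (assumption): replace None with the observed column mean."""
    means = []
    for j in range(len(data[0][0])):
        observed = [x[j] for x, _ in data if x[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [([means[j] if v is None else v for j, v in enumerate(x)], y) for x, y in data]


def toy_selector(data: List[Row], weights: List[float], cutoff: float = 0.5) -> List[int]:
    """Stand-in for a penalized fit: keep columns whose weighted class means differ."""
    selected = []
    for j in range(len(data[0][0])):
        class_means = []
        for label in (0, 1):
            pairs = [(w, x[j]) for (x, y), w in zip(data, weights) if y == label]
            total = sum(w for w, _ in pairs)
            if total == 0:
                return []  # one class is absent, so nothing can be selected
            class_means.append(sum(w * v for w, v in pairs) / total)
        if abs(class_means[1] - class_means[0]) > cutoff:
            selected.append(j)
    return selected


def bi_ss_frequencies(data: List[Row], select: Selector, n_boot: int,
                      rng: random.Random) -> List[float]:
    """BI-SS-style loop: bootstrap, impute, select, then tally selection frequency."""
    counts = [0] * len(data[0][0])
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in data]  # resample rows with replacement
        completed = mean_impute(sample)
        for j in select(completed, [1.0] * len(completed)):
            counts[j] += 1
    return [c / n_boot for c in counts]


def stacked_selection(imputed_sets: List[List[Row]], select: Selector) -> List[int]:
    """SENET-style pooling: stack M imputed datasets and fit once with 1/M weights."""
    m = len(imputed_sets)
    stacked = [row for data in imputed_sets for row in data]
    return select(stacked, [1.0 / m] * len(stacked))


# Hypothetical toy data: column 0 tracks the outcome, column 1 is constant noise.
toy: List[Row] = [([2.0 if i % 2 else 0.0, 1.0], i % 2) for i in range(20)]
toy[3] = ([None, 1.0], 1)
toy[8] = ([None, 1.0], 0)

freqs = bi_ss_frequencies(toy, toy_selector, n_boot=50, rng=random.Random(0))
stable = [j for j, f in enumerate(freqs) if f >= 0.6]  # assumed threshold

# Real multiple imputation would yield M different completed datasets;
# three identical copies are used here only to keep the sketch short.
pooled = stacked_selection([mean_impute(toy)] * 3, toy_selector)
print(freqs, stable, pooled)
```

The BI-SS-style path runs the selector many times on small datasets and aggregates the results, whereas the stacked path runs it once on a dataset M times larger. That single large fit is where the computational cost concentrates when the underlying model is expensive, as with mixed-effects models.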

Main Results:

  • BI-SS and SENET showed comparable performance for generalized linear models.
  • BI-SS performed better in nested data designs, where the computational demands of fitting SENET with mixed-effects models become a limitation.
  • Both methods were successfully demonstrated on electronic health data.

Conclusions:

  • BI-SS is a robust method for variable selection with incomplete data, particularly advantageous in nested data structures.
  • The choice between BI-SS and SENET may depend on the data structure and model complexity.
  • The demonstration on electronic health data supports applying these methods to real-world datasets such as electronic health records.