Updated: Sep 11, 2025

Constructing a binary prediction model with incomplete data: Variable selection to balance fairness and precision.

He Ren¹, Chun Wang¹, Gongjun Xu²

¹Measurement and Statistics Program, College of Education, University of Washington.

Psychological Methods | August 14, 2025

Summary

This summary is machine-generated.

This study compares two variable selection methods for incomplete data: bootstrap imputation-stability selection (BI-SS) and stacked elastic net (SENET). BI-SS is recommended for nested data designs, where it performs better than SENET in more complex models.

Area of Science:

Psychology
Statistics
Data Science

Background:

The tension between explanation and prediction in psychology necessitates robust variable selection.
Missing data complicate standard variable selection and penalized regression methods.
Evaluating model performance requires considering both prediction accuracy and fairness, especially for societal implications.

Purpose of the Study:

To explore and compare two methods for variable selection with incomplete data: bootstrap imputation-stability selection (BI-SS) and stacked elastic net (SENET).
To evaluate the performance of BI-SS and SENET using prediction accuracy and fairness metrics across various simulation complexities.
To assess the suitability of these methods for generalized linear models and nested data designs.
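
As a rough illustration of what evaluating both prediction accuracy and fairness can look like for a binary model, the sketch below computes AUC, the F1 score, and one possible fairness gap in plain Python. The summary does not say which fairness criteria the study used, so the true-positive-rate gap between two groups (`tpr_gap`) is an assumption chosen for illustration, as are all function names and the toy data.

```python
def auc(y_true, scores):
    """Chance that a random positive case outscores a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tpr_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rate between groups 0 and 1.

    Hypothetical fairness criterion (an 'equal opportunity' style gap);
    the study's actual criteria are not given in the summary.
    """
    rates = []
    for g in (0, 1):
        preds = [p for y, p, a in zip(y_true, y_pred, group) if a == g and y == 1]
        if not preds:
            raise ValueError(f"group {g} has no positive cases")
        rates.append(sum(preds) / len(preds))
    return abs(rates[0] - rates[1])


# Toy data, invented for illustration only.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1, 0.7, 0.2]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cutoff is an arbitrary choice

print("AUC:", auc(y_true, scores))
print("F1:", f1_score(y_true, y_pred))
print("TPR gap:", tpr_gap(y_true, y_pred, group))
```

A model can score well on AUC and F1 while still showing a large gap between groups, which is why the two kinds of metric are reported side by side.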

Main Methods:

Employed bootstrap imputation-stability selection (BI-SS) on multiply imputed datasets, aggregating results via stability selection.
Utilized stacked elastic net (SENET) by stacking imputed datasets for a single pooled model fit.
Conducted three simulation studies with increasing complexity, evaluating performance with metrics like AUC, F1 score, and fairness criteria.
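
The general shape of a bootstrap-impute-select-aggregate procedure such as BI-SS might be sketched as below. This is a simplified stand-in, not the authors' implementation: it assumes mean imputation in place of a proper imputation model, a linear lasso fitted by coordinate descent in place of a penalized generalized linear model, and arbitrary values for the penalty `lam`, the number of bootstrap samples, and the selection threshold. All names and the simulated data are hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Lasso via cyclic coordinate descent on (1/2n)||y - Xb||^2 + lam * |b|_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0:
                continue
            # Correlation of column j with the partial residual (column j's fit added back).
            rho = X[:, j] @ resid / n + col_ss[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def mean_impute(X):
    """Placeholder imputer (assumption): fill NaNs with the column mean."""
    return np.where(np.isnan(X), np.nanmean(X, axis=0), X)


def bi_ss(X, y, lam=0.5, n_boot=50, threshold=0.6, seed=0):
    """Bootstrap rows, impute, fit a lasso, and keep variables selected often enough."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # bootstrap sample of rows
        Xb = mean_impute(X[idx])                # impute within the bootstrap sample
        sd = Xb.std(axis=0)
        sd[sd == 0] = 1.0
        Xb = (Xb - Xb.mean(axis=0)) / sd        # standardize predictors
        yb = y[idx] - y[idx].mean()
        counts += lasso_cd(Xb, yb, lam) != 0    # record which variables were selected
    freq = counts / n_boot                      # selection frequency per variable
    return freq, np.flatnonzero(freq >= threshold)


# Simulated example: only columns 0 and 1 carry signal; 15% of cells are missing.
rng = np.random.default_rng(1)
n, p = 300, 8
X_full = rng.normal(size=(n, p))
y = 2.0 * X_full[:, 0] - 1.5 * X_full[:, 1] + rng.normal(size=n)
X_miss = X_full.copy()
X_miss[rng.random((n, p)) < 0.15] = np.nan

freq, selected = bi_ss(X_miss, y)
print("selection frequencies:", np.round(freq, 2))
print("selected columns:", selected)
```

Aggregating by selection frequency rather than by pooled coefficients means each bootstrap fit stays independent, so the inner model can be swapped for a more complex one without changing the outer loop.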

Main Results:

BI-SS and SENET showed comparable performance for generalized linear models.
BI-SS outperformed SENET in nested data designs, largely because SENET becomes computationally demanding when combined with mixed-effects models.
Both methods were successfully demonstrated on electronic health data.
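
For contrast with the bootstrap-based approach, the stacking idea behind SENET (one pooled fit over several imputed copies of the data) could look roughly like the sketch below. It is an illustration under stated assumptions, not the published method: each missing cell is filled by a random draw from the observed values in its column, every stacked row gets the same weight, and the model is a linear elastic net fitted by a hand-written weighted coordinate descent. Function names, `lam`, `alpha`, and the simulated data are all hypothetical.

```python
import numpy as np


def random_draw_impute(X, rng):
    """Placeholder imputer (assumption): fill NaNs with random observed values from the same column."""
    X_imp = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        X_imp[miss, j] = rng.choice(X[~miss, j], size=int(miss.sum()))
    return X_imp


def weighted_enet(X, y, w, lam, alpha, n_iter=200):
    """Coordinate descent for
    (1/2) sum_i w_i (y_i - x_i b)^2 + lam * (alpha * |b|_1 + (1 - alpha) / 2 * ||b||^2)."""
    p = X.shape[1]
    beta = np.zeros(p)
    resid = y.copy()
    wss = (w[:, None] * X ** 2).sum(axis=0)     # weighted sum of squares per column
    for _ in range(n_iter):
        for j in range(p):
            rho = np.sum(w * X[:, j] * resid) + wss[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam * alpha, 0.0) / (wss[j] + lam * (1 - alpha))
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def stacked_enet(X_miss, y, n_imp=5, lam=1.0, alpha=0.5, seed=0):
    """Impute n_imp times, stack the copies row-wise, and fit one weighted elastic net."""
    rng = np.random.default_rng(seed)
    X_stack = np.vstack([random_draw_impute(X_miss, rng) for _ in range(n_imp)])
    y_stack = np.tile(y, n_imp)
    # Uniform weights summing to 1, so the stack counts as one dataset rather than n_imp.
    w = np.full(len(y_stack), 1.0 / len(y_stack))
    X_stack = X_stack - np.average(X_stack, axis=0, weights=w)
    y_stack = y_stack - np.average(y_stack, weights=w)
    return weighted_enet(X_stack, y_stack, w, lam, alpha)


# Simulated example: only columns 0 and 1 carry signal; 15% of cells are missing.
rng = np.random.default_rng(1)
n, p = 300, 8
X_full = rng.normal(size=(n, p))
y = 2.0 * X_full[:, 0] - 1.5 * X_full[:, 1] + rng.normal(size=n)
X_miss = X_full.copy()
X_miss[rng.random((n, p)) < 0.15] = np.nan

beta = stacked_enet(X_miss, y)
print("pooled coefficients:", np.round(beta, 3))
print("selected columns:", np.flatnonzero(beta != 0))
```

Because every imputed copy enters a single fit, the stacked design matrix grows with the number of imputations, which hints at why a stacked fit can become costly once the inner model is a mixed-effects model.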

Conclusions:

BI-SS is a robust method for variable selection with incomplete data, particularly advantageous in nested data structures.
The choice between BI-SS and SENET may depend on the data structure and model complexity.
The demonstration on electronic health data supports applying both methods to real-world datasets such as electronic health records.