Optimized variable selection via repeated data splitting.

Marinela Capanu¹, Mihai Giurcanu², Colin B Begg¹

¹Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA.

Statistics in Medicine

April 14, 2020

Summary

This summary is machine-generated.

This study introduces a novel variable selection method for regression models. The new technique effectively identifies relevant predictors, outperforming existing methods in simulations and real-world data analysis.

Keywords:

data splitting, empirical threshold, linear regression, variable screening, variable selection

Area of Science:

Statistics
Machine Learning
Biostatistics

Background:

Model selection in high-dimensional data is well-studied.
Advances in variable selection for low-dimensional regression are lacking.
Robust variable selection is needed for moderate-scale datasets (n > p).

Purpose of the Study:

Introduce a new variable selection procedure for low to moderate scale regressions.
Provide an empirically optimized threshold for variable screening.
Evaluate the performance of the proposed method against existing techniques.

Main Methods:

Repeatedly split data into estimation and validation sets.
Empirically optimize a threshold for variable inclusion.
Compare performance with backward elimination, univariate screening, adaptive LASSO, and SCAD.
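
The splitting and threshold steps listed above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details this summary does not give. It assumes one plausible reading: predictors are screened by the absolute t-statistic from an ordinary least squares fit on the estimation half, each candidate cutoff is scored by mean squared error on the validation half, and the cutoff with the lowest total error over all splits is then applied to the full data. The function names, the half-and-half split, the screening statistic, and the scoring rule are all assumptions made for illustration.

```python
import numpy as np


def ols_t_stats(X, y):
    """OLS fit with an intercept; returns coefficients and their t-statistics."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    dof = Xd.shape[0] - Xd.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xd.T @ Xd)))
    return beta, beta / se


def validation_mse(X_est, y_est, X_val, y_val, keep):
    """Refit OLS on the kept columns (estimation set), score on the validation set."""
    Xe = np.column_stack([np.ones(len(y_est)), X_est[:, keep]])
    Xv = np.column_stack([np.ones(len(y_val)), X_val[:, keep]])
    beta, *_ = np.linalg.lstsq(Xe, y_est, rcond=None)
    return float(np.mean((y_val - Xv @ beta) ** 2))


def select_by_repeated_splits(X, y, thresholds, n_splits=100, seed=0):
    """Pick the t-statistic cutoff with the lowest summed validation error.

    Hypothetical procedure (an assumption, not taken from the paper):
    every split uses half the rows for estimation and half for validation.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = np.zeros(len(thresholds))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        est, val = perm[: n // 2], perm[n // 2:]
        _, t = ols_t_stats(X[est], y[est])
        for j, cutoff in enumerate(thresholds):
            # t[0] belongs to the intercept, so screen on t[1:] only.
            keep = np.flatnonzero(np.abs(t[1:]) > cutoff)
            errors[j] += validation_mse(X[est], y[est], X[val], y[val], keep)
    best = thresholds[int(np.argmin(errors))]
    # Apply the chosen cutoff to t-statistics from the full data.
    _, t_full = ols_t_stats(X, y)
    return best, np.flatnonzero(np.abs(t_full[1:]) > best)
```

Calling `select_by_repeated_splits(X, y, [1.0, 2.0, 3.0, 4.0])` on an `n`-by-`p` design matrix with n > p returns the chosen cutoff and the indices of the retained predictors. The sketch needs more rows in each estimation half than predictors plus one, so that the OLS fit has positive residual degrees of freedom.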

Main Results:

The proposed method demonstrates superior performance in simulations.
Achieves low inclusion of noisy predictors and high power for correct model detection.
Performance is unaffected by correlations among predictors.
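
The two criteria named above can be made concrete for a simulation study. The sketch below assumes the following definitions, which this summary does not spell out: "inclusion of noisy predictors" is the average number of selected variables that are not in the true model, and "correct model detection" is the fraction of simulated datasets in which the selected set equals the true set exactly. The function name and the returned keys are hypothetical.

```python
def selection_metrics(selected_sets, true_set):
    """Summarize variable selection results across simulated datasets.

    selected_sets: one collection of selected predictor indices per dataset.
    true_set: indices of the predictors that truly enter the model.
    The metric definitions are assumptions made for illustration.
    """
    truth = set(true_set)
    n_runs = len(selected_sets)
    # Average count of selected variables that are not true predictors.
    mean_noise = sum(len(set(s) - truth) for s in selected_sets) / n_runs
    # Share of runs where the selected set matches the true set exactly.
    correct_rate = sum(set(s) == truth for s in selected_sets) / n_runs
    return {"mean_noise_included": mean_noise, "correct_model_rate": correct_rate}
```

For example, `selection_metrics([[0, 1], [0, 1, 3], [0]], [0, 1])` covers three runs: one exact match, one with a single noise variable, and one that misses a true predictor.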

Conclusions:

The new variable selection procedure is effective for low to moderate scale regressions.
Offers a robust alternative to existing methods, especially when predictors are correlated.
Successfully applied to a clinical dataset of hepatectomy patients.