


Optimized variable selection via repeated data splitting.

Marinela Capanu¹, Mihai Giurcanu², Colin B Begg¹

  • ¹Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, USA.

Statistics in Medicine | April 14, 2020
Summary
This summary is machine-generated.

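This study introduces a variable selection procedure for low to moderate scale regression models (n > p) that repeatedly splits the data and applies an empirically optimized inclusion threshold. In simulations it outperformed backward elimination, univariate screening, adaptive LASSO, and SCAD, and it was applied to a clinical dataset of hepatectomy patients.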

Keywords:
data splitting, empirical threshold, linear regression, variable screening, variable selection


Area of Science:

  • Statistics
  • Machine Learning
  • Biostatistics

Background:

  • Model selection for high-dimensional data has been studied extensively.
  • Variable selection for low-dimensional regression has seen comparatively little recent development.
  • Moderate-scale datasets, where observations outnumber predictors (n > p), still need robust variable selection.

Purpose of the Study:

  • Introduce a new variable selection procedure for low to moderate scale regressions.
  • Provide an empirically optimized threshold for variable screening.
  • Evaluate the performance of the proposed method against existing techniques.

Main Methods:

  • Repeatedly split data into estimation and validation sets.
  • Empirically optimize a threshold for variable inclusion.
  • Compare performance with backward elimination, univariate screening, adaptive LASSO, and SCAD.
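
The splitting and threshold steps above can be illustrated with a minimal Python sketch. It is not the authors' published algorithm: the screening statistic (absolute OLS t-statistics), the 50/50 split, the validation criterion (mean squared error), and every function and parameter name below are assumptions chosen for illustration.

```python
import numpy as np

def t_statistics(X, y):
    """Absolute OLS t-statistic of each column of X (intercept added here)."""
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / (n - p - 1)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(Z.T @ Z)         # covariance of the coefficients
    return np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))

def validation_mse(X_est, y_est, X_val, y_val, keep):
    """Fit OLS on the estimation set using columns `keep`; score on validation."""
    Z_est = np.column_stack([np.ones(len(y_est)), X_est[:, keep]])
    Z_val = np.column_stack([np.ones(len(y_val)), X_val[:, keep]])
    beta, *_ = np.linalg.lstsq(Z_est, y_est, rcond=None)
    return float(np.mean((y_val - Z_val @ beta) ** 2))

def select_by_repeated_splitting(X, y, thresholds, n_splits=50, seed=0):
    """Pick the threshold with the lowest summed validation error over random
    splits, then screen the full data with it (an assumed, simplified scheme)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = np.zeros(len(thresholds))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        est, val = perm[: n // 2], perm[n // 2:]   # estimation / validation halves
        t = t_statistics(X[est], y[est])
        for j, c in enumerate(thresholds):
            keep = np.flatnonzero(t > c)           # variables passing the threshold
            errors[j] += validation_mse(X[est], y[est], X[val], y[val], keep)
    best = thresholds[int(np.argmin(errors))]
    selected = np.flatnonzero(t_statistics(X, y) > best)
    return best, selected.tolist()

# Synthetic example: only the first two of six predictors carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)
threshold, selected = select_by_repeated_splitting(
    X, y, thresholds=[1.0, 2.0, 3.0, 4.0]
)
print(threshold, selected)
```

Choosing the threshold by out-of-sample error is one plausible reading of "empirically optimize"; the published procedure may define both the statistic and the criterion differently.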

Main Results:

  • The proposed method demonstrates superior performance in simulations.
  • Achieves low inclusion of noisy predictors and high power for correct model detection.
  • Performance is unaffected by correlations among predictors.
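
For readers who want to compute comparable summaries in their own simulations, the sketch below shows one way to score a set of selection results. The definitions are assumptions, not taken from the paper: "noise inclusion" is read as the average fraction of noise predictors selected per replication, and "power" as the share of replications that recover exactly the true model. The function name and the example inputs are hypothetical.

```python
def selection_metrics(selected_sets, true_set, n_predictors):
    """Return (noise inclusion rate, exact-model detection rate).

    selected_sets: one iterable of selected column indices per replication.
    true_set: indices of the predictors that truly carry signal.
    n_predictors: total number of candidate predictors.
    """
    true_set = set(true_set)
    n_reps = len(selected_sets)
    n_noise = n_predictors - len(true_set)
    # Count selected variables that are not in the true model.
    noise_hits = sum(len(set(s) - true_set) for s in selected_sets)
    noise_rate = noise_hits / (n_noise * n_reps) if n_noise else 0.0
    # Share of replications whose selection equals the true model exactly.
    exact_rate = sum(set(s) == true_set for s in selected_sets) / n_reps
    return noise_rate, exact_rate

# Illustrative inputs only: three replications, six candidate predictors.
noise_rate, exact_rate = selection_metrics(
    selected_sets=[[0, 1], [0, 1, 4], [0]],
    true_set=[0, 1],
    n_predictors=6,
)
print(noise_rate, exact_rate)
```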

Conclusions:

  • The new variable selection procedure is effective for low to moderate scale regressions.
  • Offers a robust alternative to existing methods, especially when predictors are correlated.
  • Successfully applied to a clinical dataset of hepatectomy patients.