Search research articles

ABOUT JoVE

Overview Leadership Blog JoVE Help Center

AUTHORS

Publishing Process Editorial Board Scope & Policies Peer Review FAQ Submit

LIBRARIANS

Testimonials Subscriptions Access Resources Library Advisory Board FAQ

RESEARCH

JoVE Journal Methods Collections JoVE Encyclopedia of Experiments Archive

EDUCATION

JoVE Core JoVE Business JoVE Science Education JoVE Lab Manual Faculty Resource Center Faculty Site

Terms & Conditions of Use

Related Concept Videos

Mechanistic Models: Compartment Models in Individual and Population Analysis

Mechanistic Models: Compartment Models in Individual and Population Analysis

Mechanistic models are utilized in individual analysis using single-source data, but imperfections arise due to data collection errors, preventing perfect prediction of observed data. The mathematical equation involves known values (Xi), observed concentrations (Ci), measurement errors (εi), model parameters (ϕj), and the related function (ƒi) for i number of values. Different least-squares metrics quantify differences between predicted and observed values. The ordinary least...

Bias

Bias

Bias refers to any tendency that prevents a question from being considered unprejudiced. In research, bias occurs when one outcome or answer is selected or encouraged over others in sampling or testing. Bias can occur during any research phase, including study design, data collection, analysis, and publication.
In statistics, a sampling bias is created when a sample is collected from a population, and some members of the population are not as likely to be chosen as others (remember, each member...

Survival Tree

Survival Tree

Survival trees are a non-parametric method used in survival analysis to model the relationship between a set of covariates and the time until an event of interest occurs, often referred to as the "time-to-event" or "survival time." This method is particularly useful when dealing with censored data, where the event has not occurred for some individuals by the end of the study period, or when the exact time of the event is unknown.
Building a Survival Tree
Constructing a...

Mechanistic Models: Compartment Models in Algorithms for Numerical Problem Solving

Mechanistic Models: Compartment Models in Algorithms for Numerical Problem Solving

Mechanistic models play a crucial role in algorithms for numerical problem-solving, particularly in nonlinear mixed effects modeling (NMEM). These models aim to minimize specific objective functions by evaluating various parameter estimates, leading to the development of systematic algorithms. In some cases, linearization techniques approximate the model using linear equations.
In individual population analyses, different algorithms are employed, such as Cauchy's method, which uses a...

Statistical Inference Techniques in Hypothesis Testing: Parametric Versus Nonparametric Data

Statistical Inference Techniques in Hypothesis Testing: Parametric Versus Nonparametric Data

Statistical inference techniques, paramount in hypothesis testing, differentiate into two broad categories: parametric and nonparametric statistics.
Parametric statistics, as the name suggests, assumes that data follow a specific distribution, often a normal distribution. This assumption enables robust hypothesis testing and estimation. Parametric methods, like the Student's t-test or Goodness-of-fit test, are frequently employed in biostatistics due to their robustness. For instance,...

Accuracy and Errors in Hypothesis Testing

Accuracy and Errors in Hypothesis Testing

Hypothesis testing is a fundamental statistical tool that begins with the assumption that the null hypothesis H0 is true. During this process, two types of errors can occur: Type I and Type II. A Type I error refers to the incorrect rejection of a true null hypothesis, while a Type II error involves the failure to reject a false null hypothesis.
In hypothesis testing, the probability of making a Type I error, denoted as α, is commonly set at 0.05. This significance level indicates a 5%...

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Sort by

Same author

When to Adjust for Multiple Testing: A Unifying Guiding Principle.

Biometrical journal. Biometrische Zeitschrift·2026

Same author

Donor bone marrow together with recipient regulatory T cells induces chimerism without irradiation in kidney transplantation.

Science translational medicine·2026

Same author

Detection of previously undiagnosed conditions in midlife preventive health examinations.

Scientific reports·2026

Same author

Using routinely collected data for research purposes: challenges and mitigation strategies.

BMJ (Clinical research ed.)·2026

Same author

First attempt success rate of intraosseous access in preterm infants and neonates: a systematic review.

Resuscitation plus·2026

Same author

Sample Size Recalculation in Adaptive Group Sequential Study Designs for Comparing Restricted Mean Survival Times.

Statistics in medicine·2026

Same journal

Methods for incorporating test result information within the high-dimensional propensity score framework: application in UK electronic health record data.

BMC medical research methodology·2026

Same journal

Sparse multi-way DMDC for longitudinal classification in high dimension low sample size data.

BMC medical research methodology·2026

Same journal

Tree-based exploratory identification of predictive biomarkers in non-randomized data.

BMC medical research methodology·2026

Same journal

Comparative evaluation of interrupted time series analytical methods for healthcare quality improvement research: a Monte Carlo simulation study.

BMC medical research methodology·2026

Same journal

Methodological advances in claims-based dementia algorithms: integrating medication and clinical data for medicare populations.

BMC medical research methodology·2026

Same journal

An interpretable XGboost algorithm for predicting 30-day mortality in acute pancreatitis using routine biomarkers.

BMC medical research methodology·2026

See all related articles

Search research articles

Related Experiment Video

Updated: Oct 18, 2025

Development of an Individual-Tree Basal Area Increment Model using a Linear Mixed-Effects Approach

Development of an Individual-Tree Basal Area Increment Model using a Linear Mixed-Effects Approach

Published on: July 3, 2020

Statistical model building: Background "knowledge" based on inappropriate preselection causes misspecification.

Lorena Hafermann¹, Heiko Becher², Carolin Herrmann³

¹Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Biometry and Clinical Epidemiology, Charitéplatz 1, Berlin, 10117, Germany. lorena.hafermann@charite.de.

BMC Medical Research Methodology

|September 30, 2021

Summary

This summary is machine-generated.

Relying on "known predictors" from prior studies can lead to inaccurate statistical models. This research shows that even variables identified by multiple studies may be false positives, highlighting the need to carefully evaluate background knowledge sources.

Keywords:

Background knowledge Backward elimination Need for more data sharing Regression model Simulation study Univariable selection Variable selection

More Related Videos

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

Published on: January 11, 2020

An R-Based Landscape Validation of a Competing Risk Model

An R-Based Landscape Validation of a Competing Risk Model

Published on: September 16, 2022

Related Experiment Videos

Last Updated: Oct 18, 2025

Development of an Individual-Tree Basal Area Increment Model using a Linear Mixed-Effects Approach

Development of an Individual-Tree Basal Area Increment Model using a Linear Mixed-Effects Approach

Published on: July 3, 2020

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

Published on: January 11, 2020

An R-Based Landscape Validation of a Competing Risk Model

An R-Based Landscape Validation of a Competing Risk Model

Published on: September 16, 2022

Area of Science:

Statistics
Biostatistics
Epidemiology

Background:

Statistical model building often relies on assumed or known variable associations with outcomes.
The reliability of this
background knowledge,
particularly from preceding studies, is often questionable.
Preceding studies may have used suboptimal variable selection methods, compromising the validity of their findings.

Purpose of the Study:

To assess the impact of using variables as "known predictors" based on potentially insufficient prior study data.
To evaluate the reliability of
background knowledge
in statistical model building.

Main Methods:

A simulation study was conducted using randomly generated preceding study datasets.
Variable selection was performed within these datasets.
A variable was designated a "known" predictor if identified as relevant by a predefined number of preceding studies.

Main Results:

Classifying variables as
true
predictors based on multiple preceding studies often resulted in false positives.
Variables not identified by preceding studies may still possess true predictive value.
Inappropriate selection methods in prior studies, such as univariable selection, exacerbate these issues.

Conclusions:

The origin of
background knowledge
must be critically examined.
Information derived from preceding studies can lead to statistical model misspecification if not carefully evaluated.