Using Background Knowledge from Preceding Studies for Building a Random Forest Prediction Model: A Plasmode Simulation Study

Lorena Hafermann1, Nadja Klein2, Geraldine Rauch1

  • 1Institute of Biometry and Clinical Epidemiology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, 10117 Berlin, Germany.

Entropy (Basel, Switzerland)
|June 24, 2022
Summary
This summary is machine-generated.

Using external information to guide random forest (RF) models improved calibration but not overall prediction accuracy in patient outcome prediction. Appraising the quality of external data sources is recommended for machine learning development.

Keywords:
calibration, machine learning, sparsity, variable selection

Area of Science:

  • Medical Informatics
  • Statistical Modeling
  • Machine Learning

Background:

  • Machine learning (ML) algorithms, such as random forest (RF), are increasingly used to predict patient outcomes by identifying complex patterns in data.
  • External information is sometimes used to enhance variable selection accuracy and model interpretability in ML.
  • The benefit of external information for RF and ML prediction models remains unclear.

Purpose of the Study:

  • To investigate the utility of external information from previous variable selection studies for RF models.
  • To assess whether external information improves prediction quality and calibration in ML models.

Main Methods:

  • A plasmode simulation study was conducted using a subsampled dataset from a large pharmacoepidemiologic study (nearly 200,000 individuals).
  • The study included two binary outcomes and 1152 candidate predictor variables.
  • External information, derived from traditional statistical modeling (Lasso) and from univariate selection, was used to narrow the set of candidate predictors for the RF models.
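
The general idea of a plasmode simulation with externally restricted predictors can be sketched as follows. This is a minimal illustration, not the authors' design: it assumes the usual plasmode recipe (resample real covariate rows, then simulate the outcome from a known model), and the covariate matrix, coefficients, and selected column indices are all hypothetical stand-ins.

```python
import math
import random


def plasmode_dataset(covariates, coefs, intercept, n, rng):
    """Resample real covariate rows and simulate a binary outcome.

    Rows are drawn with replacement, so the correlation structure of the
    real covariates is preserved; only the outcome is synthetic.
    """
    rows = [rng.choice(covariates) for _ in range(n)]
    outcomes = []
    for row in rows:
        linear = intercept + sum(b * x for b, x in zip(coefs, row))
        prob = 1.0 / (1.0 + math.exp(-linear))  # assumed logistic outcome model
        outcomes.append(1 if rng.random() < prob else 0)
    return rows, outcomes


def restrict_predictors(rows, keep):
    """Keep only the columns flagged by an external selection study."""
    return [[row[j] for j in keep] for row in rows]


rng = random.Random(0)

# Hypothetical stand-in for the real covariate data (6 binary predictors).
real_covariates = [[rng.randint(0, 1) for _ in range(6)] for _ in range(200)]

# Assumed data-generating coefficients; zeros mark noise predictors.
true_coefs = [1.2, 0.0, -0.8, 0.0, 0.0, 0.5]

X, y = plasmode_dataset(real_covariates, true_coefs, -0.5, 500, rng)

# Hypothetical external information: columns a preceding study selected.
external_selection = [0, 2, 5]
X_reduced = restrict_predictors(X, external_selection)
```

A prediction model would then be fitted once on `X` and once on `X_reduced`, and the two fits compared on held-out data.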

Main Results:

  • Reducing the number of candidate predictors based on external knowledge led to improved calibration in RF models.
  • Prediction quality, measured by cross-entropy, AUROC, and Brier score, did not show improvement.
  • Externally guided variable selection may therefore benefit calibration without increasing overall predictive performance.
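
For readers unfamiliar with the metrics named above, the following sketch gives their standard textbook definitions in plain Python. The summary does not say how calibration was measured, so `calibration_in_the_large` is an assumed, deliberately simple stand-in; the toy labels and probabilities are invented for illustration.

```python
import math


def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def cross_entropy(y, p, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes (log loss)."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)


def auroc(y, p):
    """Probability that a random positive is ranked above a random negative."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y, p):
    """Observed event rate minus mean predicted probability (assumed measure)."""
    return sum(y) / len(y) - sum(p) / len(p)


# Toy data, invented for illustration only.
y_true = [0, 0, 1, 1]
p_hat = [0.1, 0.4, 0.35, 0.8]

# Squaring keeps the ranking of predictions but changes their scale.
p_squared = [pi ** 2 for pi in p_hat]

print(auroc(y_true, p_hat), auroc(y_true, p_squared))
print(brier_score(y_true, p_hat), brier_score(y_true, p_squared))
print(cross_entropy(y_true, p_hat))
print(calibration_in_the_large(y_true, p_hat))
```

AUROC depends only on the ranking of the predictions, so it is identical for `p_hat` and `p_squared`, while the Brier score and the calibration measure change. This shows that calibration and discrimination can move independently of each other.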

Conclusions:

  • External information can enhance the calibration of random forest models in patient outcome prediction.
  • Overall prediction quality metrics (cross-entropy, AUROC, Brier score) were not improved by using external information for variable selection.
  • It is crucial to critically evaluate the methodological quality of external data sources used in developing future prediction models.