CaliForest: Calibrated Random Forest for Health Data.

Yubin Park1, Joyce C Ho2

  • 1Emory University Bonsai Research, LLC.

Proceedings of the ACM Conference on Health, Inference, and Learning | July 26, 2021
Summary
This summary is machine-generated.

CaliForest improves the calibration of random forest risk prediction models without setting aside a separate calibration set; it calibrates on out-of-bag samples instead. Better-calibrated risk estimates make healthcare predictions more reliable for personalized medicine.

Keywords:
Applied computing → Health informatics; Bagging; Computing methodologies → Classification and regression trees; General and reference → Empirical studies; calibration; healthcare; python; random forest

Area of Science:

  • Healthcare Analytics
  • Machine Learning in Medicine
  • Biostatistics

Background:

  • Predictive models in healthcare require evaluation of both discrimination and calibration.
  • Calibration, the agreement between predicted risks and observed event rates, is often neglected in favor of discrimination.
  • Accurate calibration is vital for personalized medicine and clinical decision-making.
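
The difference between discrimination and calibration can be illustrated with a small, hedged sketch. It uses simulated data (not the study's data) and two hand-written helpers, `auc` and `brier_score`, which are illustrative names rather than anything from the paper. Cubing the predicted risks keeps their ranking, so discrimination (AUC) is unchanged, but the risk levels no longer match observed event rates, so the Brier score gets worse.

```python
import random

def auc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Simulated cohort: each person has a true risk, and the outcome is drawn from it.
rng = random.Random(0)
true_risk = [rng.random() for _ in range(1000)]
outcomes = [1 if rng.random() < r else 0 for r in true_risk]

calibrated = true_risk                    # predictions equal to the true risk
distorted = [r ** 3 for r in true_risk]   # same ranking, systematically too low

print("AUC   calibrated:", auc(outcomes, calibrated))
print("AUC   distorted :", auc(outcomes, distorted))
print("Brier calibrated:", brier_score(outcomes, calibrated))
print("Brier distorted :", brier_score(outcomes, distorted))
```

A model selected on AUC alone could therefore report risks that are badly off in absolute terms, which is the gap the background points describe.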

Purpose of the Study:

  • To introduce CaliForest, a novel calibrated random forest algorithm.
  • To address the common neglect of calibration in healthcare predictive modeling.
  • To provide a method that avoids explicit calibration sets by using out-of-bag samples.

Main Methods:

  • Developed CaliForest, a random forest algorithm incorporating calibration.
  • Utilized out-of-bag samples within the random forest framework for calibration.
  • Evaluated CaliForest on two binary risk prediction tasks using the MIMIC-III database.
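
The idea of calibrating on out-of-bag samples can be sketched with scikit-learn. This is a simplified illustration under stated assumptions, not the authors' CaliForest implementation: the data are synthetic (the study used MIMIC-III), the calibrator is assumed to be isotonic regression, and the out-of-bag predictions are the plain averages that scikit-learn exposes. The actual algorithm may aggregate out-of-bag predictions and fit its calibrator differently.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a binary risk prediction task (assumption for illustration).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# With oob_score=True, scikit-learn stores for every training row the averaged
# prediction of only those trees whose bootstrap sample did not contain that row.
forest = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=0)
forest.fit(X_train, y_train)
oob_prob = forest.oob_decision_function_[:, 1]

# A row that was never out-of-bag would be NaN; drop such rows defensively.
seen = ~np.isnan(oob_prob)

# Fit the calibrator on out-of-bag predictions, so no separate calibration
# split has to be carved out of the training data.
calibrator = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrator.fit(oob_prob[seen], y_train[seen])

# At prediction time, pass the forest's raw probabilities through the calibrator.
raw_prob = forest.predict_proba(X_test)[:, 1]
calibrated_prob = calibrator.predict(raw_prob)
```

Because out-of-bag predictions come from trees that never saw the row in question, they behave like held-out predictions, which is what lets every training row contribute to both fitting and calibration.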

Main Results:

  • CaliForest achieved discrimination comparable to that of a standard random forest.
  • CaliForest demonstrated superior model calibration across six different metrics.
  • The proposed method integrated calibration into the random forest itself, without a separate calibration set.
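
This summary does not name the six calibration metrics. As a hedged illustration of what a calibration metric measures, the sketch below computes a binned reliability table and an expected calibration error on toy data. The function names, the bin count, and the choice of metric are assumptions for illustration and are not necessarily among the metrics used in the study.

```python
def reliability_table(y_true, y_prob, n_bins=5):
    """Group predictions into equal-width bins; return (count, mean predicted, observed rate)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if members:  # skip empty bins
            mean_pred = sum(p for p, _ in members) / len(members)
            obs_rate = sum(y for _, y in members) / len(members)
            rows.append((len(members), mean_pred, obs_rate))
    return rows

def expected_calibration_error(y_true, y_prob, n_bins=5):
    """Size-weighted average gap between mean predicted risk and observed rate per bin."""
    n = len(y_true)
    return sum(count / n * abs(mean_pred - obs_rate)
               for count, mean_pred, obs_rate in reliability_table(y_true, y_prob, n_bins))

# Toy outcomes: 1 event among the first ten people, 9 events among the last ten.
y_true = [0] * 9 + [1] + [1] * 9 + [0]
well_calibrated = [0.1] * 10 + [0.9] * 10   # predicted risks match observed rates
overconfident = [0.0] * 10 + [1.0] * 10     # same ranking, risks pushed to extremes

print(expected_calibration_error(y_true, well_calibrated))
print(expected_calibration_error(y_true, overconfident))
```

Both toy models rank the two groups identically, so a discrimination metric would not separate them; the calibration error does.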

Conclusions:

  • CaliForest offers a robust solution for improving the calibration of random forest models in healthcare.
  • The method enhances the reliability of risk predictions for personalized medicine.
  • Open-source availability facilitates adoption and further research in calibrated machine learning for health.