A Comparison of Random Forest Variable Selection Methods for Classification Prediction Modeling.

Jaime Lynn Speiser1, Michael E Miller1, Janet Tooze1

  • 1Department of Biostatistical Sciences, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.

Expert Systems with Applications | September 24, 2020
Summary
This summary is machine-generated.

For random forest classification, Jiang's method and VSURF are top variable selection choices. For datasets with many predictors, varSelRF and Boruta offer better computational efficiency.

Keywords:
classification, feature reduction, random forest, variable selection

Area of Science:

  • Machine Learning
  • Computational Statistics
  • Bioinformatics

Background:

  • Random forest classification is widely used for predictive modeling.
  • Reducing variables in prediction models enhances efficiency and lowers data collection costs.
  • Guidance on selecting optimal variable selection methods for random forests is limited.

Purpose of the Study:

  • To evaluate and compare various random forest variable selection methods.
  • To identify preferable methods based on dataset characteristics and computational efficiency.
  • To provide guidance for selecting variable selection techniques in expert and intelligent systems.

Main Methods:

  • Utilized 311 publicly available classification datasets.
  • Assessed prediction error rates, number of variables retained, computation times, and area under the receiver operating characteristic curve (AUC).
  • Compared standard random forest with conditional random forest methods, and test-based with performance-based methods.
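
The four measures listed above can be illustrated with a minimal Python sketch. This is not the authors' code (the study used R packages); the function names `error_rate`, `auc`, and `timed_selection`, and the toy data in the usage note, are assumptions made here for illustration. The AUC is computed for a binary outcome through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
import time
from typing import Callable, List, Sequence, Tuple


def error_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of observations whose predicted class differs from the truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Binary AUC as P(score of positive > score of negative), ties = 0.5.

    Uses a plain pairwise comparison, which is quadratic in the sample
    size; fine for a sketch, too slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def timed_selection(
    select: Callable[[object, object], Sequence[int]], X: object, y: object
) -> Tuple[List[int], int, float]:
    """Run a variable selection callable (hypothetical interface: it takes
    X and y and returns the indices of the variables it keeps) and report
    the kept indices, how many were kept, and the elapsed seconds."""
    start = time.perf_counter()
    selected = list(select(X, y))
    elapsed = time.perf_counter() - start
    return selected, len(selected), elapsed
```

As a usage example with made-up values, `error_rate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])` gives 0.4 (two of five predictions are wrong), and `auc([1, 0, 1, 1, 0], [0.9, 0.2, 0.4, 0.8, 0.5])` gives 5/6, because five of the six positive-negative pairs are ranked correctly. Any selection method can be passed to `timed_selection` as long as it is wrapped to match the assumed interface.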

Main Results:

  • Jiang's method and the VSURF R package method are recommended for most datasets.
  • For datasets with numerous predictors, varSelRF and Boruta R packages are more computationally efficient.
  • Performance varied based on dataset type (binary, many predictors, imbalanced outcomes) and method type.

Conclusions:

  • Variable selection is crucial for optimizing random forest models.
  • The choice of variable selection method should consider dataset characteristics and computational constraints.
  • This study offers empirical evidence to guide the selection of random forest variable selection methods.