Matched Forest: supervised learning for high-dimensional matched case-control studies.

Nooshin Shomal Zadeh¹, Sangdi Lin², George C Runger¹

¹School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281, USA.

Bioinformatics (Oxford, England)

October 18, 2019

Summary

This summary is machine-generated.

Matched Forest (MF) offers a novel approach for variable selection in high-dimensional matched case-control studies. This method effectively identifies key exposure variables and their interactions, improving upon existing techniques.

Area of Science:

Biostatistics
Epidemiology
Computational Biology

Background:

Matched case-control studies are essential in biomedical research for identifying exposures associated with health conditions.
Traditional variable selection methods struggle with high-dimensional data and complex variable interactions.

Purpose of the Study:

To introduce a flexible and effective method for variable selection in high-dimensional matched case-control data.
To address the limitations of existing methods in detecting interaction effects.

Main Methods:

The study presents Matched Forest (MF), a novel algorithm based on the potential outcome model.
MF transforms matched case-control data by incorporating counterfactuals.
Variable importance is assessed using a modified score from a supervised learner.
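
The summary does not spell out the transformation or the score, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that each matched pair is reduced to a within-pair difference (case minus control, labeled 1) and that the "counterfactual" is the sign-flipped difference (control minus case, labeled 0). A trivial sign-based score stands in for the modified importance score of a supervised learner. The names `mirror_pairs` and `sign_importance` and the toy data are hypothetical.

```python
def mirror_pairs(cases, controls):
    """Build a labeled dataset from matched pairs.

    Assumption (not from the summary): each pair yields the difference
    case - control with label 1, plus its sign-flipped mirror with label 0,
    the mirror playing the role of the counterfactual observation.
    """
    rows, labels = [], []
    for case, control in zip(cases, controls):
        diff = [a - b for a, b in zip(case, control)]
        rows.append(diff)
        labels.append(1)
        rows.append([-d for d in diff])
        labels.append(0)
    return rows, labels


def sign_importance(rows, labels):
    """Stand-in importance score: how well the sign of each variable's
    difference predicts the label, rescaled so 0 means chance level
    and 1 means perfect separation."""
    scores = []
    for j in range(len(rows[0])):
        correct = sum(
            1 for row, y in zip(rows, labels) if (row[j] > 0) == (y == 1)
        )
        accuracy = correct / len(rows)
        scores.append(abs(accuracy - 0.5) * 2)
    return scores


# Toy data: four matched pairs, two exposure variables.
# Variable 0 is always higher in the case; variable 1 is higher half the time.
cases = [[5.0, 1.0], [6.0, 2.0], [7.0, 0.0], [4.0, 3.0]]
controls = [[3.0, 2.0], [4.0, 1.0], [5.0, 1.0], [3.0, 2.0]]

rows, labels = mirror_pairs(cases, controls)
scores = sign_importance(rows, labels)
print(scores)  # [1.0, 0.0]
```

A per-variable score like this cannot detect interaction effects. In practice the mirrored dataset would be passed to a learner that can, such as a random forest, and that learner's importance measure would replace `sign_importance`.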

Main Results:

Simulation studies demonstrate MF's efficacy in identifying significant variables.
The algorithm successfully detects interaction effects among variables.
MF is applied to biomedical data, showing competitive performance against alternative methods.

Conclusions:

Matched Forest provides a robust and adaptable solution for variable selection in complex epidemiological studies.
The method's ability to handle high-dimensional data and interactions enhances its utility in biomedical research.
MF is accessible through readily available software tools.