MissForest--non-parametric missing value imputation for mixed-type data.

Daniel J Stekhoven (1), Peter Bühlmann

  • (1) Seminar for Statistics, Department of Mathematics, ETH Zurich, Zurich, Switzerland.

Bioinformatics (Oxford, England), November 1, 2011
Summary
This summary is machine-generated.

This study introduces missForest, an iterative imputation method that uses random forests to fill in missing values in mixed-type datasets. In the authors' comparisons, missForest outperforms other imputation methods, especially where complex interactions and non-linear relationships are suspected.

Area of Science:

  • Bioinformatics and Computational Biology
  • Statistical Learning and Data Mining

Background:

  • High-throughput data acquisition frequently produces missing values, which is a problem for the many analytical algorithms that require complete datasets.
  • Existing imputation methods are often limited to a single variable type (continuous or categorical) and so cannot exploit relationships between variables of different types.

Purpose of the Study:

  • To propose and evaluate a non-parametric imputation method capable of simultaneously handling mixed-type variables.
  • To address the limitations of existing imputation techniques that ignore relationships between variables of different types.

Main Methods:

  • Developed an iterative imputation method named missForest, based on random forests.
  • Averaged over many unpruned classification or regression trees, so that the random forest itself acts as a multiple imputation scheme.
  • Employed out-of-bag error estimates for imputation error assessment without requiring a separate test set.
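
The iterative scheme summarised in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: to stay dependency-free, the random forest is swapped for a small k-nearest-neighbour predictor (`knn_predict`), and the stopping rule is simplified to "stop once the change between successive imputations no longer shrinks". The function names, the toy data, and the choice of `k` are all hypothetical.

```python
from statistics import mean, mode


def knn_predict(train_X, train_y, query, kinds, is_cat, k=3):
    """Stand-in learner (NOT a random forest): vote/average of the k nearest rows."""
    def dist(a, b):
        # Mismatch (0/1) for categorical predictors, absolute difference for numeric ones.
        return sum((u != v) if c else abs(u - v) for u, v, c in zip(a, b, kinds))
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))[:k]
    vals = [train_y[i] for i in nearest]
    return mode(vals) if is_cat else mean(vals)


def iterative_impute(rows, cat_cols, max_iter=10, k=3):
    """Fill None entries by repeatedly re-predicting each incomplete column."""
    n, p = len(rows), len(rows[0])
    missing = [[v is None for v in r] for r in rows]
    data = [list(r) for r in rows]
    # Initial guess: column mean (numeric) or column mode (categorical).
    for j in range(p):
        obs = [data[i][j] for i in range(n) if not missing[i][j]]
        fill = mode(obs) if j in cat_cols else mean(obs)
        for i in range(n):
            if missing[i][j]:
                data[i][j] = fill
    # Visit incomplete columns from fewest to most missing values.
    order = sorted((j for j in range(p) if any(m[j] for m in missing)),
                   key=lambda j: sum(m[j] for m in missing))
    prev_change = float("inf")
    for _ in range(max_iter):
        old = [list(r) for r in data]
        for j in order:
            others = [c for c in range(p) if c != j]
            kinds = [c in cat_cols for c in others]
            # Train on rows where column j was observed; predict where it was missing.
            train = [i for i in range(n) if not missing[i][j]]
            train_X = [[data[i][c] for c in others] for i in train]
            train_y = [data[i][j] for i in train]
            for i in range(n):
                if missing[i][j]:
                    query = [data[i][c] for c in others]
                    data[i][j] = knn_predict(train_X, train_y, query, kinds,
                                             j in cat_cols, k)
        # Change over imputed cells: squared difference (numeric), mismatch count (categorical).
        change = sum(
            (data[i][j] != old[i][j]) if j in cat_cols else (data[i][j] - old[i][j]) ** 2
            for i in range(n) for j in range(p) if missing[i][j])
        if change >= prev_change:
            return old  # simplified stopping rule: keep the previous imputation
        prev_change = change
    return data


# Toy mixed-type data: two numeric columns and one categorical column (index 2).
rows = [
    [1.0, 2.1, "a"],
    [2.0, 3.9, "a"],
    [3.0, None, "a"],
    [8.0, 16.2, "b"],
    [9.0, 18.1, None],
    [10.0, 19.8, "b"],
]
result = iterative_impute(rows, cat_cols={2}, k=2)
```

Passing the learner's role to a simple neighbour-based predictor keeps the loop structure visible; replacing `knn_predict` with a random forest fitted on the same `train_X` and `train_y` would bring the sketch closer to the method described here.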

Main Results:

  • MissForest successfully handles missing values in datasets with mixed variable types and outperformed the other imputation methods it was compared against.
  • Demonstrated superior performance in datasets with suspected complex interactions and non-linear relationships.
  • Out-of-bag error estimates were found to be adequate, with missForest showing computational efficiency and scalability to high-dimensional data.
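
The idea behind an out-of-bag error estimate can be illustrated with a small bagging sketch. Each learner is trained on a bootstrap sample, and every observation is scored only by the learners whose sample did not contain it, so no separate test set is needed. The code below is an assumption-laden illustration rather than the paper's procedure: the base learner is a 1-nearest-neighbour lookup standing in for a tree, and `oob_mse`, its parameters, and the toy data are hypothetical.

```python
import random
from statistics import mean


def oob_mse(xs, ys, n_learners=200, seed=0):
    """Out-of-bag mean squared error of a bagged 1-nearest-neighbour ensemble."""
    rng = random.Random(seed)
    n = len(xs)
    preds = [[] for _ in range(n)]  # OOB predictions collected per observation
    for _ in range(n_learners):
        # Bootstrap sample: n draws with replacement.
        in_bag = sorted({rng.randrange(n) for _ in range(n)})
        in_bag_set = set(in_bag)
        for i in range(n):
            if i not in in_bag_set:
                # Observation i is out-of-bag for this learner: predict it
                # from the nearest in-bag point (stand-in for a tree).
                nearest = min(in_bag, key=lambda j: abs(xs[j] - xs[i]))
                preds[i].append(ys[nearest])
    # Score only observations that were out-of-bag at least once.
    scored = [i for i in range(n) if preds[i]]
    error = mean((mean(preds[i]) - ys[i]) ** 2 for i in scored)
    return error, len(scored)


# Toy data: a noiseless linear relationship.
xs = [float(x) for x in range(20)]
ys = [2.0 * x for x in xs]
mse, n_scored = oob_mse(xs, ys)
```

With enough learners, each observation is left out of roughly a third of the bootstrap samples, which is why nearly every observation ends up with an out-of-bag prediction.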

Conclusions:

  • MissForest provides an effective and efficient solution for missing value imputation in mixed-type, high-dimensional datasets.
  • The method's ability to capture complex data relationships makes it particularly valuable for biological data analysis.