A real data-driven simulation strategy to select an imputation method for mixed-type trait data.

Jacqueline A May¹, Zeny Feng², Sarah J Adamowicz¹

  • ¹Department of Integrative Biology & Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada.

PLOS Computational Biology | March 22, 2023
Summary
This summary is machine-generated.

Selecting the best method to fill in missing trait data is crucial for biological analyses. A data-driven simulation using squamate traits found random forests with phylogenetic information to be most effective for imputation.

Area of Science:

  • Evolutionary Biology
  • Bioinformatics
  • Comparative Genomics

Background:

  • Missing observations in biological trait datasets hinder analyses across various disciplines.
  • Existing imputation methods yield mixed results, necessitating a framework for selecting appropriate techniques for diverse, real-world datasets.
  • Trait datasets often contain mixed data types (categorical, count, continuous), complicating imputation strategies.

Purpose of the Study:

  • To develop and validate a real data-driven simulation strategy for selecting the optimal imputation method for mixed-type trait datasets.
  • To evaluate the performance of candidate imputation methods, including mean/mode, k-nearest neighbour, random forests, and MICE, with and without phylogenetic information.

Main Methods:

  • A squamate trait dataset was used as a target, with missing data simulated under missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR) mechanisms.
  • Imputation was performed using candidate methods, incorporating phylogenetic information from nuclear, mitochondrial, or multigene trees.
  • Performance was assessed using mean squared error (MSE) for numerical traits and the proportion falsely classified (PFC) for categorical traits.
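
The three missingness mechanisms and the two error measures named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names (`mcar_mask`, `rank_driven_mask`, `mse`, `pfc`) and the rank-based rule for making missingness depend on a trait are assumptions chosen for clarity.

```python
import random

def mcar_mask(n, rate, rng):
    """MCAR: each value goes missing with the same probability `rate`."""
    return [rng.random() < rate for _ in range(n)]

def rank_driven_mask(driver, rate, rng):
    """Missingness probability rises linearly with the rank of `driver`.

    Assumed rule for illustration: p_i = min(1, 2 * rate * rank_i / (n - 1)),
    so the average probability is about `rate` when rate <= 0.5.
    - MAR:  pass a different, fully observed trait as `driver`.
    - MNAR: pass the trait that is being masked as `driver`.
    """
    n = len(driver)
    order = sorted(range(n), key=lambda i: driver[i])
    ranks = [0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return [rng.random() < min(1.0, 2.0 * rate * ranks[i] / (n - 1))
            for i in range(n)]

def mse(truth, imputed, mask):
    """Mean squared error over the masked (imputed) numerical values only."""
    errors = [(t - p) ** 2 for t, p, m in zip(truth, imputed, mask) if m]
    return sum(errors) / len(errors)

def pfc(truth, imputed, mask):
    """Proportion of masked categorical values that were imputed wrongly."""
    wrong = [t != p for t, p, m in zip(truth, imputed, mask) if m]
    return sum(wrong) / len(wrong)

if __name__ == "__main__":
    rng = random.Random(1)
    body_mass = [12.0, 48.5, 7.3, 150.2, 33.1, 5.9]   # hypothetical trait
    latitude = [10.0, 35.0, 2.0, 50.0, 20.0, 41.0]    # hypothetical covariate
    print("MCAR:", mcar_mask(len(body_mass), 0.3, rng))
    print("MAR: ", rank_driven_mask(latitude, 0.3, rng))
    print("MNAR:", rank_driven_mask(body_mass, 0.3, rng))
```

Both error measures are computed only over the values that were artificially removed, since those are the only positions where the true value is known and an imputed value exists.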

Main Results:

  • The random forest method, enhanced with a nuclear-derived phylogeny, demonstrated the lowest error rates across most traits.
  • Imputed datasets reflected the characteristics and distributions of the original data more accurately than complete-case datasets did.
  • Phylogenetic information did not consistently improve performance for all traits or scenarios, highlighting the need for careful method selection.

Conclusions:

  • A real data-driven simulation strategy is effective for selecting suitable imputation methods for mixed-type trait datasets.
  • Random forests combined with appropriate phylogenetic data offer a robust approach for trait data imputation in evolutionary biology.
  • Caution is advised, as the utility of phylogenetic information in imputation varies by trait and missingness mechanism.
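
The selection strategy summarized above (mask known values in a complete dataset, impute them with each candidate, and keep the method with the lowest error) can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's pipeline: it handles a single numerical trait under MCAR only, uses mean imputation and a one-covariate nearest-neighbour average as stand-ins for the candidate methods, and all names (`mean_impute`, `knn_impute`, `select_method`) and the toy data are hypothetical.

```python
import random

def mean_impute(values, mask, covariate):
    """Replace masked values with the mean of the observed values."""
    observed = [v for v, m in zip(values, mask) if not m]
    mean = sum(observed) / len(observed)
    return [mean if m else v for v, m in zip(values, mask)]

def knn_impute(values, mask, covariate, k=3):
    """Replace masked values with the average of the k observed values
    whose covariate is closest (a simple stand-in for kNN imputation)."""
    observed = [i for i, m in enumerate(mask) if not m]
    out = list(values)
    for i, m in enumerate(mask):
        if m:
            nearest = sorted(observed,
                             key=lambda j: abs(covariate[j] - covariate[i]))[:k]
            out[i] = sum(values[j] for j in nearest) / len(nearest)
    return out

def masked_mse(truth, imputed, mask):
    """Mean squared error over the masked positions only."""
    errors = [(t - p) ** 2 for t, p, m in zip(truth, imputed, mask) if m]
    return sum(errors) / len(errors)

def select_method(truth, covariate, methods, rate=0.2, n_reps=50, seed=0):
    """Repeatedly mask complete data, impute with each method, and
    return the method with the lowest average error plus all averages."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in methods}
    done = 0
    for _ in range(n_reps):
        mask = [rng.random() < rate for _ in truth]
        if not any(mask) or all(mask):
            continue  # skip replicates with nothing to impute or nothing observed
        for name, impute in methods.items():
            totals[name] += masked_mse(truth, impute(truth, mask, covariate), mask)
        done += 1
    mean_err = {name: total / done for name, total in totals.items()}
    return min(mean_err, key=mean_err.get), mean_err

if __name__ == "__main__":
    covariate = list(range(30))               # toy, fully observed covariate
    truth = [2 * x + 1 for x in covariate]    # toy complete-case trait
    best, errors = select_method(
        truth, covariate, {"mean": mean_impute, "knn": knn_impute})
    print(best, errors)
```

Because the winner depends on the data and on how values go missing, the same loop would need to be repeated per trait and per missingness mechanism before choosing a method for a real dataset.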