
Fitting Gaussian mixture models on incomplete data.

Zachary R McCaw¹, Hugues Aschard², Hanna Julienne²

¹Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA. zmccaw@alumni.harvard.edu.

BMC Bioinformatics

June 1, 2022

Summary

This summary is machine-generated.

Missing data are common in bioinformatics. The missingness-aware Gaussian mixture models (MGMM) R package fits Gaussian mixture models to incomplete datasets and recovers cluster assignments more accurately than existing methods.

Keywords: Clustering; Gaussian mixture models; Missing data

Area of Science:

Bioinformatics
Computational Biology
Data Science

Background:

Bioinformatics research often integrates diverse datasets, leading to incomplete data with missing values.
Existing Gaussian mixture model (GMM) implementations handle missing data poorly: they either require restrictive assumptions or fall back on complete-case analysis or imputation, both of which can bias results.
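To make the cost of complete-case analysis concrete, the sketch below counts how many rows survive when every row containing a missing value is dropped. The simulated data, the 10% per-entry missingness rate, and the helper name `complete_case_fraction` are assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def complete_case_fraction(data):
    """Return the fraction of rows that contain no missing (NaN) entries."""
    data = np.asarray(data, dtype=float)
    complete = ~np.isnan(data).any(axis=1)
    return float(complete.mean())

# Hypothetical example: 1000 samples, 5 features, each entry missing
# independently with probability 0.10.
rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 5))
mask = rng.random(x.shape) < 0.10
x_missing = np.where(mask, np.nan, x)

frac = complete_case_fraction(x_missing)
print(f"rows kept by complete-case analysis: {frac:.2f}")
```

Under these assumptions the expected fraction of complete rows is 0.9^5, roughly 0.59, so about two in five samples would be discarded before any clustering takes place.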

Purpose of the Study:

To introduce missingness-aware Gaussian mixture models (MGMM), an R package designed to fit GMMs robustly in the presence of missing data.
To provide a statistically sound and user-friendly tool for clustering and density estimation with incomplete datasets.

Main Methods:

Development of the MGMM R package, which accommodates missing data without imposing restrictions on the covariance matrix.
Evaluation using three case studies involving real and simulated 'omics data.
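One way to accommodate missing values without restricting the covariance matrix is to evaluate each mixture component on the observed coordinates only, since the marginal of a multivariate Gaussian is obtained by dropping the matching entries of the mean and the matching rows and columns of the covariance. The sketch below shows that step for a single observation. It is an illustration in Python rather than the MGMM package's code, and the function names `observed_log_density` and `responsibilities` are hypothetical. A full fit would also need a parameter-update step that handles the missing coordinates, which is not shown.

```python
import numpy as np

def observed_log_density(x, mean, cov):
    """Log density of the observed coordinates of x under N(mean, cov).

    Missing coordinates (NaN) are integrated out, which for a Gaussian
    means keeping only the observed entries of mean and cov.
    """
    obs = ~np.isnan(x)
    d = int(obs.sum())
    if d == 0:
        return 0.0  # nothing observed: the marginal density is 1
    diff = x[obs] - mean[obs]
    sub_cov = cov[np.ix_(obs, obs)]
    _, logdet = np.linalg.slogdet(sub_cov)
    maha = diff @ np.linalg.solve(sub_cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def responsibilities(x, weights, means, covs):
    """Posterior probability of each component given the observed part of x."""
    log_post = np.array([
        np.log(w) + observed_log_density(x, m, c)
        for w, m, c in zip(weights, means, covs)
    ])
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical two-component mixture in two dimensions.
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs = [np.eye(2), np.eye(2)]

x = np.array([3.5, np.nan])  # second coordinate is missing
resp = responsibilities(x, weights, means, covs)
print(resp)
```

The observation is assigned using only its first coordinate, so it is neither discarded nor imputed, and the covariance matrices can remain unstructured.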

Main Results:

MGMM demonstrated superior performance in recovering true cluster assignments compared to existing GMM implementations and standard GMMs with imputation.
The package accurately assesses cluster assignment uncertainty, even when data distributions deviate from a true GMM.
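Recovery of true cluster assignments is commonly scored with the adjusted Rand index (ARI), which compares two labelings while ignoring how the clusters are named. This summary does not state which metric the study used, so treating ARI as the measure here is an assumption. The sketch below is a plain-Python implementation for illustration.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    total = comb(n, 2)
    if total == 0:
        return 1.0  # fewer than two samples: labelings trivially agree
    pair_counts = Counter(zip(labels_true, labels_pred))
    sum_ij = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / total
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # both labelings are trivial (e.g. a single cluster)
    return (sum_ij - expected) / (max_index - expected)

truth = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]  # same partition, different cluster names
crossed = [0, 1, 0, 1]     # partition that cuts across the truth

print(adjusted_rand_index(truth, relabelled))
print(adjusted_rand_index(truth, crossed))
```

An ARI of 1 indicates identical partitions, values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.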

Conclusions:

MGMM significantly improves cluster assignment recovery across various datasets and missingness rates compared to state-of-the-art methods.
MGMM offers a powerful, accessible, and statistically valid solution for bioinformatics analyses involving missing data.