VICatMix: variational Bayesian clustering and variable selection for discrete biomedical data.

Jackie Rao¹, Paul D W Kirk¹,²,³

¹MRC Biostatistics Unit, University of Cambridge, Cambridge, CB2 0SR, United Kingdom.

Bioinformatics Advances | April 10, 2025

Summary

This summary is machine-generated.

VICatMix is a variational Bayesian finite mixture model for clustering high-dimensional categorical data in precision medicine. It uses variational inference to keep training computationally efficient while maintaining accuracy, supporting patient stratification and disease subtyping.

Area of Science:

Computational biology
Bioinformatics
Statistical genetics

Background:

Clustering biomedical data is vital for patient stratification in precision medicine.
High-dimensional categorical data, like 'omics data, require computationally efficient algorithms.
Existing methods struggle with scalability and accuracy for complex datasets.

Purpose of the Study:

To introduce VICatMix, a variational Bayesian finite mixture model for categorical data clustering.
To enhance computational efficiency and scalability in clustering high-dimensional biomedical data.
To enable accurate patient stratification and discovery of disease subtypes.
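
For orientation, a finite mixture model for categorical data is commonly written as follows. This is a textbook formulation rather than something stated in this summary: the symbols (N observations, P variables, K components, weights π, category probabilities φ) and the symmetric Dirichlet priors with hyperparameters α and ε are assumptions made for illustration, and the published model may differ in its details.

```latex
\begin{aligned}
\pi &\sim \mathrm{Dirichlet}(\alpha, \dots, \alpha), \\
\phi_{kj} &\sim \mathrm{Dirichlet}(\varepsilon, \dots, \varepsilon),
  \qquad k = 1, \dots, K, \quad j = 1, \dots, P, \\
z_n \mid \pi &\sim \mathrm{Categorical}(\pi),
  \qquad n = 1, \dots, N, \\
x_{nj} \mid z_n = k, \phi &\sim \mathrm{Categorical}(\phi_{kj}).
\end{aligned}
```

Under this sketch, clustering amounts to inferring the latent labels z_n, and a variational treatment would approximate the joint posterior with a factorized distribution such as q(z) q(π, φ).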

Main Methods:

Developed VICatMix, a variational Bayesian finite mixture model for categorical data.
Implemented variational inference for computationally efficient training and scalability.
Incorporated variable selection, summarization, and model averaging for improved performance.
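
To make the idea of variational inference for a categorical mixture concrete, here is a minimal sketch in Python with NumPy. It is not the VICatMix implementation and does not use the R package's API. It assumes a plain finite mixture of independent categorical variables with symmetric Dirichlet priors, and it omits variable selection, summarization, and model averaging. The function names, the hyperparameters `alpha` and `eps`, and the toy data are all hypothetical.

```python
import numpy as np


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    x = np.asarray(x, dtype=float)
    shift = np.zeros_like(x)
    # Use psi(x) = psi(x + 1) - 1/x to move every argument up to x >= 6.
    for _ in range(6):
        small = x < 6.0
        shift = shift - np.where(small, 1.0 / x, 0.0)
        x = np.where(small, x + 1.0, x)
    inv2 = 1.0 / (x * x)
    series = np.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return shift + series


def fit_categorical_mixture(X, n_components, n_categories,
                            alpha=1.0, eps=1.0, n_iter=50, seed=0):
    """Mean-field VI for a finite mixture of independent categoricals.

    X is an (n, p) integer array with entries in {0, ..., n_categories - 1}.
    alpha and eps are symmetric Dirichlet hyperparameters (illustrative values).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    onehot = np.eye(n_categories)[X]                      # (n, p, L)
    resp = rng.dirichlet(np.ones(n_components), size=n)   # random soft start
    for _ in range(n_iter):
        # Update q(pi) and q(phi): prior counts plus expected counts.
        alpha_post = alpha + resp.sum(axis=0)                     # (K,)
        eps_post = eps + np.einsum("nk,njl->kjl", resp, onehot)   # (K, p, L)
        # Expected log parameters under the Dirichlet factors.
        e_log_pi = digamma(alpha_post) - digamma(alpha_post.sum())
        e_log_phi = digamma(eps_post) - digamma(eps_post.sum(axis=2, keepdims=True))
        # Update q(z): normalise log responsibilities row by row.
        log_rho = e_log_pi + np.einsum("njl,kjl->nk", onehot, e_log_phi)
        log_rho -= log_rho.max(axis=1, keepdims=True)
        resp = np.exp(log_rho)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp, alpha_post, eps_post


# Toy data: two noise-free groups over 10 binary variables.
X = np.vstack([np.zeros((30, 10), dtype=int), np.ones((30, 10), dtype=int)])
resp, alpha_post, eps_post = fit_categorical_mixture(X, n_components=2, n_categories=2)
labels = resp.argmax(axis=1)  # hard cluster assignment per observation
```

Each pass alternates between updating the Dirichlet factors from expected counts and recomputing the soft assignments `resp`, which is the usual coordinate-ascent scheme for this kind of model. The hand-written `digamma` only keeps the sketch dependent on NumPy alone; a library routine such as `scipy.special.digamma` would be the more natural choice in practice.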

Main Results:

VICatMix outperforms existing methods in computational time and scalability while maintaining accuracy.
The model effectively performs variable selection on high-dimensional, noisy data.
Demonstrated utility in cancer subtyping and driver gene discovery using The Cancer Genome Atlas data.

Conclusions:

VICatMix offers a computationally efficient and accurate solution for clustering high-dimensional categorical biomedical data.
The model facilitates precise patient stratification and the discovery of novel disease subtypes through integrative analysis.
VICatMix is available as an R package for broader research application.