Accelerating Bayesian hierarchical clustering of time series data with a randomised algorithm.

Robert Darkins¹, Emma J Cooke, Zoubin Ghahramani

¹Systems Biology Centre, University of Warwick, Coventry, United Kingdom.

April 9, 2013

Summary

This summary is machine-generated.

A randomized algorithm substantially speeds up Bayesian Hierarchical Clustering (BHC) of time series data, such as microarray gene expression data, with minimal effect on clustering quality. The speed-up helps in analyzing large biological datasets.

Area of Science:

Computational Biology
Statistical Computing
Bioinformatics

Background:

The era of big data necessitates advanced statistical algorithms for experimental data analysis.
Faster algorithms are crucial for handling large genomic datasets and applying sophisticated statistical methods.

Purpose of the Study:

To present a randomized algorithm that accelerates time series data clustering using Bayesian Hierarchical Clustering (BHC).
To apply and analyze this algorithm for microarray gene expression data analysis.

Main Methods:

Development and analysis of a randomized algorithm for Bayesian Hierarchical Clustering (BHC).
Application of the algorithm to discretely sampled time series data, with a focus on microarray gene expression data.
Evaluation using both synthetic and real biological datasets.
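
This page does not describe the algorithm's steps, so the following Python sketch is only an illustration of one general way a randomized, divide-and-conquer hierarchical clustering can work: cluster a small random subset exactly, use its top-level split to partition the remaining series, and recurse on each part. Everything here is an assumption made for illustration, including the function names (`top_level_split`, `randomized_tree`), the subset size `m`, and the use of centroid distance in place of a Bayesian merge score. It is not the authors' implementation and not the API of the R package BHC.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equally sampled series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(data, cluster):
    """Pointwise mean of the series whose row indices are in `cluster`."""
    cols = zip(*(data[i] for i in cluster))
    return [sum(col) / len(cluster) for col in cols]


def top_level_split(data, idx):
    """Greedy agglomerative clustering of `idx`, stopped at two clusters.

    Stand-in for the expensive exact step. Clusters are merged by centroid
    distance here (an assumption), not by a Bayesian merge probability.
    """
    clusters = [[i] for i in idx]
    while len(clusters) > 2:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(centroid(data, clusters[i]),
                            centroid(data, clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters[0], clusters[1]


def randomized_tree(data, idx, m, rng):
    """Build a binary tree (nested tuples of row indices) over `idx`.

    Groups larger than m are split using a random subset of size m (m >= 2);
    the remaining series are then assigned to the nearer of the two halves.
    """
    if len(idx) == 1:
        return idx[0]
    if len(idx) <= m:
        left, right = top_level_split(data, idx)
    else:
        sample = rng.sample(idx, m)
        a, b = top_level_split(data, sample)
        ca, cb = centroid(data, a), centroid(data, b)
        # Sampled series stay on their own side, so both sides are non-empty.
        left, right = list(a), list(b)
        chosen = set(sample)
        for i in idx:
            if i in chosen:
                continue
            nearer_a = sq_dist(data[i], ca) <= sq_dist(data[i], cb)
            (left if nearer_a else right).append(i)
    return (randomized_tree(data, left, m, rng),
            randomized_tree(data, right, m, rng))


def leaves(tree):
    """Flatten a tree from randomized_tree into a list of row indices."""
    if isinstance(tree, int):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])
```

A call such as `randomized_tree(data, list(range(len(data))), 4, random.Random(0))` returns nested tuples of row indices. In this sketch a smaller `m` makes each exact step cheaper, but the subset may then represent the full group less well, which is the speed-versus-fidelity trade-off the results below refer to.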

Main Results:

The randomized BHC algorithm demonstrates substantial improvements in speed.
Clustering quality is minimally affected; results remain close to those of the original method.
The algorithm is effective for analyzing large-scale gene expression datasets.

Conclusions:

The randomized time series BHC algorithm clusters significantly faster than the original method, which is crucial for big data applications in bioinformatics.
The method provides a practical tool for researchers analyzing complex biological datasets, with software available in the R package BHC.