Bayesian clustering with uncertain data.

Kath Nicholls (1,2), Paul D W Kirk (1,2,3), Chris Wallace (1,2)

  • (1) Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, United Kingdom.

PLOS Computational Biology | September 3, 2024
Summary
This summary is machine-generated.

We developed Dirichlet Process Mixtures with Uncertainty (DPMUnc), a novel clustering method that effectively uses data uncertainty. DPMUnc improves disease classification, particularly for immune-mediated diseases (IMD), and enables gene signature analysis in new datasets.

Area of Science:

  • Bioinformatics and Computational Biology
  • Statistical Learning and Data Mining
  • Genomics and Systems Biology

Background:

  • Clustering is a fundamental technique in bioinformatics and other fields for data analysis and prediction.
  • Existing clustering methods often fail to incorporate data uncertainty or measurement error, limiting their effectiveness.
  • Immune-mediated diseases (IMD) represent a complex group of disorders requiring sophisticated analytical approaches for subtyping.

Purpose of the Study:

  • To introduce Dirichlet Process Mixtures with Uncertainty (DPMUnc), a novel Bayesian nonparametric clustering algorithm designed to leverage data uncertainty.
  • To demonstrate the superior performance of DPMUnc compared to existing methods using simulated and real-world biological data.
  • To develop and validate a new procedure for applying gene signatures to datasets where they were not originally discovered.
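
For readers unfamiliar with the term, "Bayesian nonparametric" here means the number of clusters is not fixed in advance. A Dirichlet process prior on partitions can be simulated with the Chinese restaurant process. The following is a minimal sketch of that prior only; the function name `sample_crp` and the concentration parameter `alpha` are illustrative choices, not taken from the DPMUnc software.

```python
import random


def sample_crp(n_items, alpha, seed=0):
    """Draw one partition of n_items from a Chinese restaurant process.

    alpha is the concentration parameter: larger values favour more clusters.
    Returns (assignments, cluster_sizes).
    """
    rng = random.Random(seed)
    assignments = []    # cluster index chosen for each item
    cluster_sizes = []  # current number of items in each cluster
    for _ in range(n_items):
        # An existing cluster is chosen with weight equal to its size;
        # a brand-new cluster is opened with weight alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)   # open a new cluster
        else:
            cluster_sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, cluster_sizes


if __name__ == "__main__":
    labels, sizes = sample_crp(n_items=30, alpha=1.0, seed=1)
    print("cluster sizes:", sizes)
```

The number of clusters is itself random and tends to grow slowly with the number of items, which is why no cluster count has to be supplied up front.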

Main Methods:

  • Developed DPMUnc, an extension of Bayesian nonparametric clustering that explicitly incorporates data uncertainty.
  • Applied DPMUnc to cluster immune-mediated diseases (IMD) using genome-wide association study (GWAS) summary statistics, accounting for the uncertainty in those statistics, which depends on each study's sample size.
  • Introduced a novel procedure for summarizing gene expression data using gene signatures, including gene expression variability, for cross-dataset application.
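
To illustrate what "incorporating data uncertainty" can mean in a mixture model, the sketch below uses a deliberately simplified setting: one dimension, a finite Gaussian mixture with fixed parameters, and a known measurement variance for every observation. Under those assumptions the likelihood of an observation in a cluster is Gaussian with variance equal to the cluster variance plus that observation's measurement variance. This is not the DPMUnc sampler, and all names (`responsibilities`, `obs_var`, `cluster_vars`) are hypothetical.

```python
import numpy as np


def responsibilities(x, obs_var, means, cluster_vars, weights):
    """Posterior cluster probabilities for noisy 1-D observations.

    x            : observed values, shape (n,)
    obs_var      : known measurement variance of each observation, shape (n,)
    means        : cluster means, shape (k,)
    cluster_vars : within-cluster variances, shape (k,)
    weights      : mixture weights, shape (k,)
    """
    x = np.asarray(x, dtype=float)[:, None]
    obs_var = np.asarray(obs_var, dtype=float)[:, None]
    means = np.asarray(means, dtype=float)[None, :]
    # Marginalising the latent true value adds the two variances.
    total_var = np.asarray(cluster_vars, dtype=float)[None, :] + obs_var
    log_pdf = -0.5 * (np.log(2 * np.pi * total_var) + (x - means) ** 2 / total_var)
    log_post = np.log(np.asarray(weights, dtype=float))[None, :] + log_pdf
    # Subtract the row maximum before exponentiating for numerical stability.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Two observations with the same value but very different precision.
    r = responsibilities(
        x=[1.0, 1.0],
        obs_var=[0.01, 25.0],
        means=[0.0, 4.0],
        cluster_vars=[1.0, 1.0],
        weights=[0.5, 0.5],
    )
    print(r)
```

In this toy example the precisely measured point is assigned almost entirely to the nearer cluster, while the imprecisely measured point with the same value is split nearly evenly between the two clusters, so it carries less weight in the clustering.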

Main Results:

  • DPMUnc outperformed existing clustering methods on simulated data.
  • Clustering of IMD using GWAS data with DPMUnc successfully separated autoimmune from autoinflammatory diseases and identified subgroups like adult-onset arthritis.
  • Clustering of gene expression datasets from IMD patients using summarized gene signatures showed disease associations and consistent structures across datasets.
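
This summary does not spell out how the gene signatures were summarized. As one plausible reading only, the sketch below reduces each sample to a signature score (the mean expression over the signature genes) together with a variability term (the squared standard error of that mean), which could then serve as the per-observation uncertainty fed to an uncertainty-aware clustering method. The function `summarise_signature` and its inputs are hypothetical and are not the authors' procedure.

```python
import numpy as np


def summarise_signature(expr, gene_names, signature):
    """Summarise a gene signature per sample as (score, uncertainty).

    expr       : expression matrix, shape (n_samples, n_genes)
    gene_names : list of gene identifiers, one per column of expr
    signature  : list of gene identifiers forming the signature
    Returns the per-sample mean over signature genes and the squared
    standard error of that mean.
    """
    # Keep only signature genes that are present in this dataset.
    idx = [gene_names.index(g) for g in signature if g in gene_names]
    if len(idx) < 2:
        raise ValueError("need at least two signature genes present in the data")
    sub = np.asarray(expr, dtype=float)[:, idx]
    score = sub.mean(axis=1)
    # Sample variance across signature genes, divided by the gene count.
    uncertainty = sub.var(axis=1, ddof=1) / len(idx)
    return score, uncertainty


if __name__ == "__main__":
    expr = [[1.0, 2.0, 3.0, 10.0],
            [2.0, 2.0, 2.0, 0.0]]
    genes = ["A", "B", "C", "D"]
    score, unc = summarise_signature(expr, genes, signature=["A", "B", "C"])
    print(score, unc)
```

Both toy samples receive the same score, but the first has a larger uncertainty because its signature genes disagree more, which is the kind of information a method that ignores variability would discard.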

Conclusions:

  • Data uncertainty should be actively incorporated into clustering algorithms, and DPMUnc provides an effective method for this purpose.
  • The novel gene signature summarization procedure enables robust analysis of gene expression data across different datasets and disease contexts.
  • DPMUnc and the gene signature application method offer valuable tools for advancing the understanding and classification of complex diseases like IMD.