SiGMoiD: A super-statistical generative model for binary data.

Xiaochuan Zhao¹, Germán Plata², Purushottam D. Dixit¹,³

  • ¹Department of Physics, University of Florida, Gainesville, Florida, United States of America.

PLOS Computational Biology | August 6, 2021
Summary
This summary is machine-generated.

The Super-statistical Generative Model for binary Data (SiGMoiD) infers its constraints directly from data, which makes probabilistic modeling of large collections of binary variables computationally efficient. The approach models complex biological data with more than 1000 variables and yields a reduced-dimensional description that can be used to identify clusters.

Area of Science:

  • Computational Biology
  • Statistical Modeling
  • Machine Learning

Background:

  • Probabilistic models for co-varying binary variables are crucial in computational biology.
  • Existing generative models that can handle large numbers of variables (N ~ 100) are computationally expensive and require constraints to be identified manually.

Purpose of the Study:

  • To introduce the Super-statistical Generative Model for binary Data (SiGMoiD), a novel framework that addresses limitations in modeling large binary datasets.
  • To develop a computationally efficient method for inferring constraints directly from data, enabling scalable probabilistic modeling.

Main Methods:

  • SiGMoiD utilizes a maximum entropy-based framework, conceptualizing data as arising from a super-statistical system.
  • The algorithm infers constraints directly from the data, bypassing the need for manual specification.
  • The model handles a large number of binary variables (N>1000) and provides a reduced dimensional data description.
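
As a rough illustration of the idea in the list above, the sketch below fits a low-rank Bernoulli model to binary data with NumPy. It is not the authors' implementation. It assumes, for illustration only, that each sample s carries its own latent parameters beta[s, k] (the "super-statistical" part) and that each variable i carries shared features E[i, k], with P(x_si = 1) = sigmoid(sum_k beta[s, k] * E[i, k]). The names fit_low_rank_bernoulli, beta, and E, the plain gradient-ascent fit, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_likelihood(data, beta, E):
    """Bernoulli log-likelihood of binary data under logits beta @ E.T."""
    logits = beta @ E.T
    return float(np.sum(data * logits - np.logaddexp(0.0, logits)))

def fit_low_rank_bernoulli(data, n_latent=2, n_steps=500, lr=0.2, seed=0):
    """Gradient ascent on the log-likelihood of an assumed low-rank model.

    data: (n_samples, n_vars) array of 0/1 values.
    Returns per-sample parameters beta (n_samples, n_latent) and
    per-variable features E (n_vars, n_latent), both learned from data.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = data.shape
    beta = 0.1 * rng.standard_normal((n_samples, n_latent))
    E = 0.1 * rng.standard_normal((n_vars, n_latent))
    for _ in range(n_steps):
        # Derivative of the log-likelihood with respect to the logits.
        resid = data - sigmoid(beta @ E.T)
        # Gradients are averaged so the step size does not depend on data size.
        grad_beta = resid @ E / n_vars
        grad_E = resid.T @ beta / n_samples
        beta += lr * grad_beta
        E += lr * grad_E
    return beta, E

# Synthetic demo (assumed data): two groups of samples, two blocks of variables.
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 100)    # 200 samples
pattern = np.repeat([0, 1], 15)   # 30 binary variables
p_true = np.where(group[:, None] == pattern[None, :], 0.9, 0.1)
data = (rng.random(p_true.shape) < p_true).astype(float)

beta0, E0 = fit_low_rank_bernoulli(data, n_steps=0)   # untrained baseline
beta, E = fit_low_rank_bernoulli(data)
print("log-likelihood before fit:", log_likelihood(data, beta0, E0))
print("log-likelihood after fit: ", log_likelihood(data, beta, E))
```

In this sketch the rows of beta act as a reduced-dimensional description of the samples and the rows of E as one of the variables; no constraint is specified by hand, since both are learned from the data.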

Main Results:

  • SiGMoiD successfully models collections of very large numbers of binary variables (N>1000).
  • The framework infers optimal constraints directly from data, enhancing efficiency and scalability.
  • The reduced-dimensional description allows clusters of both data points and variables to be identified.

Conclusions:

  • SiGMoiD offers a versatile and efficient solution for building probabilistic generative models for large-scale binary data.
  • The method's ability to infer constraints and reduce dimensionality makes it suitable for diverse biological datasets across various scales.
  • SiGMoiD advances computational biology by enabling more scalable and insightful analysis of complex binary datasets.