Model-Based Causal Discovery for Zero-Inflated Count Data.

Junsouk Choi¹, Yang Ni²

  • ¹ Department of Statistics, Texas A&M University, College Station, TX 98195-4322, USA.

Journal of Machine Learning Research : JMLR | August 12, 2025
Summary
This summary is machine-generated.

We introduce a new zero-inflated generalized hypergeometric directed acyclic graph (ZiG-DAG) model to uncover causal relationships from observational count data with excess zeros. This flexible model accurately captures complex data features and outperforms existing methods in causal structure learning.

Keywords:
Bayesian network; Causal identifiability; Directed acyclic graph; Observational zero-inflated count data; Single-cell RNA-sequencing

Area of Science:

  • Statistics
  • Bioinformatics
  • Genomics

Background:

  • Zero-inflated count data are prevalent across scientific disciplines, including social science, biology, and genomics.
  • Existing causal discovery methods struggle to accommodate the excess zeros and overdispersion common in multivariate count data.
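
To make "excess zeros and overdispersion" concrete, the following is a minimal sketch that simulates zero-inflated Poisson counts and compares them with what a plain Poisson model would predict. The zero-inflated Poisson is used here only as a familiar illustration; it is an assumption of this example, not the model family of the paper, and the names `sample_zip` and `summarize` and the parameter values are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zip(pi_zero, lam, rng):
    """Zero-inflated Poisson draw: a structural zero with probability
    pi_zero, otherwise an ordinary Poisson(lam) count."""
    if rng.random() < pi_zero:
        return 0
    return sample_poisson(lam, rng)


def summarize(counts):
    """Return the sample mean, sample variance, observed zero fraction,
    and the zero probability a Poisson with the same mean would give."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    zero_fraction = sum(1 for c in counts if c == 0) / n
    return {
        "mean": mean,
        "variance": variance,
        "zero_fraction": zero_fraction,
        "poisson_zero_prob": math.exp(-mean),
    }


# Illustrative parameter values, chosen arbitrarily for this sketch.
rng = random.Random(0)
counts = [sample_zip(0.4, 3.0, rng) for _ in range(5000)]
stats = summarize(counts)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this setup the sample variance exceeds the sample mean (overdispersion), and the observed share of zeros is well above the zero probability of a Poisson distribution with the same mean (zero inflation).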

Purpose of the Study:

  • Propose a novel zero-inflated generalized hypergeometric directed acyclic graph (ZiG-DAG) model for causal inference from observational zero-inflated count data.
  • Develop a flexible framework capable of modeling diverse zero-inflated count data types and accommodating both linear and nonlinear causal relationships.
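
As a rough guide to what such a model looks like, a generic zero-inflated count distribution and the usual DAG factorization can be written as below. This is the standard textbook form, given here as an assumption for orientation; the summary does not state the paper's exact parametrization of the generalized hypergeometric family, so the base distribution $f$ and its parameters $\theta$ are left abstract.

```latex
% Zero-inflated count distribution: a point mass at zero mixed with a
% base count distribution f (mixing weight \pi, 0 \le \pi < 1).
P(X = x) = \pi \, \mathbb{1}\{x = 0\} + (1 - \pi) \, f(x; \theta),
\qquad x = 0, 1, 2, \dots

% DAG factorization: each variable depends only on its parents pa(j).
p(x_1, \dots, x_p) = \prod_{j=1}^{p} p\bigl(x_j \mid x_{\mathrm{pa}(j)}\bigr)
```

Under this reading, each conditional $p(x_j \mid x_{\mathrm{pa}(j)})$ would be a zero-inflated distribution whose parameters depend, linearly or nonlinearly, on the parent values.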

Main Methods:

  • The ZiG-DAG model leverages a generalized hypergeometric probability distribution family for flexible data modeling.
  • Causal structure identifiability is proven using a general technique applicable to count data.
  • Score-based algorithms are employed for efficient causal structure learning.
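
The idea of score-based structure learning can be sketched as a greedy search that toggles one edge at a time and keeps any change that raises a decomposable score while preserving acyclicity. This is a generic hill-climbing sketch, not the algorithm from the paper. The function names are hypothetical, and `toy_local_score` is a stand-in lookup used only so the example runs; in practice the local score would be something like a penalized log-likelihood of each node given its parents.

```python
from itertools import permutations


def has_cycle(parents):
    """Return True if the graph {node: set_of_parents} has a directed cycle."""
    state = {}  # node -> 1 (visit in progress) or 2 (finished)

    def visit(node):
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        if any(visit(p) for p in parents[node]):
            return True
        state[node] = 2
        return False

    return any(visit(n) for n in parents)


def total_score(parents, local_score):
    """Decomposable score: a sum of per-node terms."""
    return sum(local_score(n, frozenset(pa)) for n, pa in parents.items())


def greedy_search(nodes, local_score):
    """First-improvement hill climbing over DAGs, starting from no edges."""
    parents = {n: set() for n in nodes}
    best = total_score(parents, local_score)
    improved = True
    while improved:
        improved = False
        for src, dst in permutations(nodes, 2):
            candidate = {n: set(pa) for n, pa in parents.items()}
            # Toggle the edge src -> dst (add if absent, delete if present).
            candidate[dst] ^= {src}
            if has_cycle(candidate):
                continue
            score = total_score(candidate, local_score)
            if score > best:
                parents, best, improved = candidate, score, True
    return parents, best


# Hypothetical toy score: penalize each parent that differs from a fixed
# reference graph A -> B -> C. It is NOT a statistical score.
TOY_PARENTS = {
    "A": frozenset(),
    "B": frozenset({"A"}),
    "C": frozenset({"B"}),
}


def toy_local_score(node, parent_set):
    return -len(parent_set ^ TOY_PARENTS[node])


learned, score = greedy_search(["A", "B", "C"], toy_local_score)
print(learned, score)
```

Because the score is a sum of per-node terms, a real implementation would typically rescore only the node whose parent set changed; the full recomputation here keeps the sketch short.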

Main Results:

  • The proposed ZiG-DAG model demonstrates superior performance in discovering causal structures from observational zero-inflated count data compared to state-of-the-art methods.
  • Extensive synthetic experiments and a real-world dataset with known ground truth validate the model's effectiveness.
  • The method successfully reverse-engineered a gene regulatory network from single-cell RNA-sequencing data, showcasing practical utility.

Conclusions:

  • The ZiG-DAG model offers a robust and flexible approach for causal discovery from complex zero-inflated count data.
  • The identifiability proof and developed algorithms provide a strong foundation for future causal inference research in this domain.
  • The model's application in bioinformatics highlights its potential for unraveling biological networks and driving scientific discovery.