Distribution-free model selection for longitudinal zero-inflated count data with missing responses and covariates.

Chun-Shu Chen1, Chung-Wei Shen2

  • 1Graduate Institute of Statistics, National Central University, Taoyuan, Taiwan, Republic of China.

Statistics in Medicine | April 16, 2022
Summary
This summary is machine-generated.

This study introduces a new method for analyzing complex count data with many zeros and missing values, common in medical and social sciences. The approach helps identify key factors influencing outcomes, even with incomplete data.

Keywords:
generalized estimating equations; missing at random; two-component mixture models; variable selection; zero-inflation

Area of Science:

  • Biostatistics
  • Epidemiology
  • Longitudinal Data Analysis

Background:

  • Count data with excess zeros and clustered correlations are prevalent in medical and social science research.
  • Existing parametric models, such as the zero-inflated binomial (ZIB), zero-inflated negative binomial (ZINB), and zero-inflated Poisson (ZIP) models, have limitations when data are missing or when their distributional assumptions are violated.
  • Semiparametric weighted generalized estimating equations offer a robust approach for handling missingness in longitudinal count data.
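
To make "excess zeros" concrete, the following minimal sketch simulates counts from a two-component zero-inflated Poisson mixture and compares the share of zeros with what a plain Poisson model would predict. The function names, the mixing probability of 0.3, and the Poisson mean of 2.0 are illustrative assumptions, not values from the study, and the sketch ignores clustering and missingness.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_zip(n, pi_zero, lam, seed=0):
    """Draw n zero-inflated Poisson counts.

    Each count is a structural zero with probability pi_zero,
    otherwise an ordinary Poisson(lam) draw (which may also be zero).
    """
    rng = random.Random(seed)
    return [0 if rng.random() < pi_zero else sample_poisson(lam, rng)
            for _ in range(n)]

def zip_zero_probability(pi_zero, lam):
    """P(Y = 0) under the mixture: pi + (1 - pi) * exp(-lam)."""
    return pi_zero + (1.0 - pi_zero) * math.exp(-lam)

# Hypothetical parameter values chosen only for illustration
counts = sample_zip(n=20000, pi_zero=0.3, lam=2.0, seed=1)
observed_zero_fraction = sum(c == 0 for c in counts) / len(counts)
sample_mean = sum(counts) / len(counts)
poisson_zero_probability = math.exp(-2.0)  # zeros expected without inflation

print(f"observed zero fraction: {observed_zero_fraction:.3f}")
print(f"ZIP zero probability:   {zip_zero_probability(0.3, 2.0):.3f}")
print(f"Poisson zero probability with the same rate: {poisson_zero_probability:.3f}")
```

The gap between the last two printed values is the excess-zero mass that a plain Poisson model cannot represent.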

Purpose of the Study:

  • To propose a distribution-free model selection criterion for identifying important covariates in longitudinal count data with excess zeros and missingness.
  • To evaluate the performance of the proposed covariate selection method under various scenarios of excess zeros and missing data.
  • To illustrate the application of the method using a real-world cardiovascular disease dataset.

Main Methods:

  • Development of a model selection criterion based on expected weighted quadratic loss for covariate selection.
  • Application of semiparametric weighted generalized estimating equations to handle non-monotone missingness in responses and covariates.
  • Simulation studies to assess covariate selection effects under different percentages of excess zeros and missing data.
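
The idea of weighting a quadratic loss to correct for missingness can be sketched as below. This is not the authors' criterion, whose exact form is not given in this summary. It is a toy inverse-probability-weighted loss under loudly stated assumptions: only the response is missing, the observation probabilities are known, there is no clustering, and all names (`ipw_quadratic_loss`, `ipw_mean`, the covariate `x`) are hypothetical.

```python
import random

def ipw_quadratic_loss(y, mu_hat, observed, obs_prob):
    """Quadratic loss over observed responses, each weighted by 1 / P(observed)."""
    total = sum((yi - mi) ** 2 / pi
                for yi, mi, ri, pi in zip(y, mu_hat, observed, obs_prob) if ri)
    return total / len(y)

def ipw_mean(y, observed, obs_prob, keep):
    """Inverse-probability-weighted mean of y over observed units where keep is True."""
    num = sum(yi / pi for yi, ri, pi, ki in zip(y, observed, obs_prob, keep)
              if ri and ki)
    den = sum(1.0 / pi for ri, pi, ki in zip(observed, obs_prob, keep)
              if ri and ki)
    return num / den

rng = random.Random(7)
n = 5000
x = [rng.random() < 0.5 for _ in range(n)]  # hypothetical binary covariate
# Toy zero-heavy counts whose nonzero part depends on x
y = [0 if rng.random() < 0.4
     else (rng.randint(4, 8) if xi else rng.randint(1, 3))
     for xi in x]
# Missing at random: the chance of observing y depends only on the observed x
obs_prob = [0.4 if xi else 0.9 for xi in x]
observed = [rng.random() < pi for pi in obs_prob]

# Candidate A: intercept only. Candidate B: a separate mean for each level of x.
mu_all = ipw_mean(y, observed, obs_prob, [True] * n)
mu_x1 = ipw_mean(y, observed, obs_prob, x)
mu_x0 = ipw_mean(y, observed, obs_prob, [not xi for xi in x])
fit_a = [mu_all] * n
fit_b = [mu_x1 if xi else mu_x0 for xi in x]

loss_a = ipw_quadratic_loss(y, fit_a, observed, obs_prob)
loss_b = ipw_quadratic_loss(y, fit_b, observed, obs_prob)

# Reference values for candidate B: the full-data loss (unavailable in
# practice) and the unweighted complete-case loss.
full_loss_b = sum((yi - mi) ** 2 for yi, mi in zip(y, fit_b)) / n
n_obs = sum(observed)
cc_loss_b = sum((yi - mi) ** 2
                for yi, mi, ri in zip(y, fit_b, observed) if ri) / n_obs

print(f"weighted loss, intercept only: {loss_a:.3f}")
print(f"weighted loss, with x:         {loss_b:.3f}")
print(f"full-data loss, with x:        {full_loss_b:.3f}")
print(f"complete-case loss, with x:    {cc_loss_b:.3f}")
```

In this toy setup the weighted loss tracks the full-data loss, while the complete-case loss is distorted because units with `x` true are observed less often. Because candidate A is nested in candidate B and both are fitted with the same weights, the in-sample weighted loss can never favor A; a usable selection criterion therefore has to account for model complexity in some way, which this sketch does not attempt.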

Main Results:

  • The proposed model selection criterion effectively identifies relevant covariates without assuming a specific distribution for the data.
  • The method demonstrates robustness in scenarios with substantial excess zeros and non-monotone missingness.
  • The real data example on cardiovascular disease illustrates the practical utility of the approach.

Conclusions:

  • The developed distribution-free covariate selection method provides a valuable tool for analyzing complex longitudinal count data.
  • This approach enhances the reliability of statistical inference in the presence of excess zeros and missing data.
  • The findings have significant implications for studies in medical and social sciences where such data characteristics are common.