Updated: Jul 27, 2025

Identifying potential significant factors impacting zero-inflated proportion data.

Mélina Ribaud¹, Edith Gabriel¹, Joseph Hughes²

¹INRAE, BioSP, Avignon, France.

Statistics in Medicine

June 8, 2023

Summary

This summary is machine-generated.

This study introduces a novel permutation-based method to identify key factors influencing zero-inflated proportion data (ZIPD). The approach effectively explains correlations and predicts response variable ranks in epidemiological data.

Keywords:

COVID-19; Spearman's correlation; equine influenza; performance indicator; permutation test; ranking

Area of Science:

Biostatistics
Epidemiology
Data Science

Background:

Classical supervised methods struggle with dependent, continuous, bounded, and zero-inflated proportion data (ZIPD).
Identifying significant factors for ZIPD is crucial in fields like epidemiology.

Purpose of the Study:

To propose a novel within-block permutation-based methodology for identifying factors impacting ZIPD.
To develop a performance indicator for quantifying explained correlation by significant factors.
To enable prediction of response variable ranks based on observed factors.

Main Methods:

A within-block permutation-based approach is used.
The methodology identifies discrete or continuous factors significantly correlated with ZIPD.
A performance indicator is proposed to measure the explained correlation percentage.
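
The summary does not spell out the algorithm, so the following is only a minimal sketch of what a generic within-block permutation test can look like, assuming Spearman's correlation (listed among the keywords) as the test statistic. The function names `average_ranks`, `spearman`, and `within_block_permutation_test`, the two-sided p-value, and the add-one correction are illustrative assumptions, not the authors' implementation. Shuffling the response only inside each block keeps the between-block structure intact, and average ranks handle the many ties created by zeros.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values (e.g. many zeros) their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start..end, 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; treated as 0 in this sketch
    return cov / (var_x * var_y) ** 0.5


def within_block_permutation_test(factor, response, blocks, n_perm=2000, seed=0):
    """Two-sided permutation p-value, shuffling the response only within blocks."""
    rng = random.Random(seed)
    observed = spearman(factor, response)
    # Group observation indices by block label.
    groups = {}
    for i, label in enumerate(blocks):
        groups.setdefault(label, []).append(i)
    extreme = 0
    for _ in range(n_perm):
        permuted = list(response)
        for idx in groups.values():
            shuffled = [response[i] for i in idx]
            rng.shuffle(shuffled)
            for i, value in zip(idx, shuffled):
                permuted[i] = value
        if abs(spearman(factor, permuted)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up toy data: a zero-inflated proportion observed in two blocks.
    factor = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
    response = [0.0, 0.0, 0.1, 0.0, 0.3, 0.5, 0.0, 0.0, 0.0, 0.2, 0.4, 0.6]
    blocks = ["a"] * 6 + ["b"] * 6
    rho, p_value = within_block_permutation_test(factor, response, blocks)
    print(f"Spearman rho = {rho:.3f}, permutation p-value = {p_value:.4f}")
```

A fixed `seed` makes the permutation draws reproducible; in practice the number of permutations would be chosen to match the desired p-value resolution.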

Main Results:

The proposed methodology effectively identifies significant factors for ZIPD.
A performance indicator quantifies the explanatory power of identified factors.
The approach successfully predicts response variable ranks in simulated and real-world epidemiological data.

Conclusions:

The developed methodology offers a robust solution for analyzing ZIPD.
This approach enhances understanding of transmission probabilities (e.g., influenza) and mortality dynamics (e.g., COVID-19).
The method provides valuable tools for epidemiological research and data analysis.