A Bayesian nonparametric approach to correct for underreporting in count data.

Serena Arima¹, Silvia Polettini², Giuseppe Pasculli³

¹Department of Human and Social Sciences, University of Salento, Via di Valesio, 73100, LECCE, Italy.

Biostatistics (Oxford, England)

October 9, 2023

Summary

This summary is machine-generated.

This study introduces a new statistical model for estimating the prevalence of underreported diseases, such as chronic kidney disease in Italy. By accounting for data quality issues, the model supports disease surveillance and management.

Keywords:

Chronic kidney disease (CKD); Compound Poisson distribution; Data quality; Dependent Dirichlet process; MCMC; Underreporting

Area of Science:

Biostatistics
Epidemiology
Public Health

Background:

Accurate disease prevalence estimation is crucial for public health monitoring and management.
Count data often suffer from underreporting, particularly in heterogeneous regions.
Existing methods may not adequately address data quality issues and underreporting.

Purpose of the Study:

To propose a novel nonparametric compound Poisson model for underreported count data.
To incorporate latent clustering of reporting probabilities into the model.
To accurately estimate disease prevalence, using chronic kidney disease in Apulia, Italy as a case study.
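
For orientation, a compound Poisson model for underreported counts is commonly written in the Poisson-binomial thinning form below. This is a sketch of that standard form, not the paper's exact specification, and the symbols are assumptions introduced here: Y_i is the true count in area i, Z_i the reported count, \lambda_i the true rate, and \pi_i the reporting probability.

```latex
\begin{aligned}
Y_i &\sim \mathrm{Poisson}(\lambda_i), \\
Z_i \mid Y_i &\sim \mathrm{Binomial}(Y_i, \pi_i), \\
\text{so that, marginally,}\quad Z_i &\sim \mathrm{Poisson}(\lambda_i \pi_i).
\end{aligned}
```

Because the reported counts depend only on the product \lambda_i \pi_i, the rate and the reporting probability cannot be separated from the counts alone. Additional information about the reporting process is needed to tell them apart.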

Main Methods:

Developed a nonparametric compound Poisson model with latent clustering for reporting probabilities.
Estimated model parameters using expert opinion and a proxy for the reporting process.
Applied the model to a unique database of 258 municipalities in Apulia, Italy.
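
The idea behind the methods above can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes the standard Poisson-binomial thinning form and replaces the nonparametric clustering with a fixed, finite set of reporting-probability clusters. The function name, the rates, and the cluster values are all hypothetical.

```python
import numpy as np


def simulate_underreported_counts(true_rates, cluster_probs, cluster_weights, seed=0):
    """Simulate true and reported counts under Poisson-binomial thinning.

    Assumption (not from the paper): each area belongs to one of a fixed
    number of latent clusters, and all areas in a cluster share the same
    reporting probability.
    """
    rng = np.random.default_rng(seed)
    true_rates = np.asarray(true_rates, dtype=float)
    cluster_probs = np.asarray(cluster_probs, dtype=float)

    # Assign each area to a latent reporting cluster.
    labels = rng.choice(len(cluster_probs), size=true_rates.size, p=cluster_weights)
    reporting = cluster_probs[labels]

    # True (unobserved) counts, then thinned (observed) counts.
    true_counts = rng.poisson(true_rates)
    observed = rng.binomial(true_counts, reporting)
    return true_counts, observed, labels


if __name__ == "__main__":
    # Hypothetical setup: 258 areas with an invented common rate of 50 cases,
    # and two invented reporting clusters (40% and 90% reporting).
    y, z, k = simulate_underreported_counts(
        np.full(258, 50.0), cluster_probs=[0.4, 0.9], cluster_weights=[0.5, 0.5]
    )
    print("mean true count:    ", y.mean())
    print("mean reported count:", z.mean())
    for c in (0, 1):
        print(f"cluster {c}: reported/true =", z[k == c].sum() / y[k == c].sum())
```

In this toy setting the naive mean of the reported counts understates the true mean, and the size of the shortfall differs by cluster. That is the kind of heterogeneity a clustering of reporting probabilities is meant to capture.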

Main Results:

The model provided accurate prevalence estimates for chronic kidney disease in Apulia.
Results revealed interesting geographical patterns of the disease within the region.
In comparisons with existing approaches on simulated and real data, the model was accurate and well suited to data with only partial quality information.

Conclusions:

The proposed model effectively addresses underreported count data by modeling reporting probability heterogeneity.
It offers a valuable tool for accurate disease surveillance and management, especially in data-scarce or data-quality-challenged settings.
The approach is versatile and validated through application to both chronic kidney disease and early neonatal mortality risk data.