EpiCurveBench: Evaluating epidemic curve digitization.

Thomas Berkane¹, Maimuna Majumder¹

  • ¹Computational Health Informatics Program, Boston Children's Hospital & Harvard Medical School, Boston, MA 02115, USA.

medRxiv: the preprint server for health sciences | December 11, 2025
PubMed
Summary
This summary is machine-generated.

Digitizing disease case count charts (epicurves) is crucial for accurate forecasting models. A new benchmark, EpiCurveBench, and a new metric, EpiCurve Similarity (ECS), were developed to evaluate automated epicurve extraction. The evaluation revealed significant challenges for current methods.

Keywords: Chart Data Extraction, Dataset, Epidemiology, Vision-Language Models

Area of Science:

  • Computational epidemiology
  • Data science
  • Medical informatics

Background:

  • Accurate disease case counts over time are vital for training reliable disease forecasting models.
  • Epidemic curve (epicurve) images, commonly used to display these data, are often not machine-readable.
  • Manual digitization is time-consuming, and existing automated methods fail with complex, real-world epicurves.
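
To make "machine-readable" concrete, here is a minimal sketch of what a digitized epicurve might look like once extracted from an image. The CSV layout (`date,cases`), the function name `load_epicurve`, and the values are all assumptions made for illustration; the summary does not specify the format used by EpiCurveBench.

```python
import csv
import io
from datetime import date

# Hypothetical digitized epicurve: one row per reporting date.
# The values below are made up for illustration only.
CSV_TEXT = """date,cases
2025-01-06,3
2025-01-13,8
2025-01-20,5
"""


def load_epicurve(text: str) -> dict[date, int]:
    """Parse 'date,cases' CSV text into a date -> case count mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return {date.fromisoformat(row["date"]): int(row["cases"]) for row in reader}


curve = load_epicurve(CSV_TEXT)
total_cases = sum(curve.values())
```

A date-indexed series like this can be fed directly to a forecasting model, which is not possible with the chart image alone.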

Purpose of the Study:

  • To address the limitations in automated epicurve data extraction.
  • To create a comprehensive benchmark dataset for evaluating epicurve extraction methods.
  • To introduce a novel evaluation metric that accurately assesses temporal data extraction.

Main Methods:

  • Developed EpiCurveBench, a benchmark dataset of 100 manually curated and annotated epicurve images from diverse sources.
  • Introduced EpiCurve Similarity (ECS), a new metric designed to evaluate the temporal structure and accuracy of extracted epicurve data.
  • Evaluated state-of-the-art chart data extraction models on the EpiCurveBench dataset using the ECS metric.
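
The summary does not give the definition of ECS, so the following is only an illustrative stand-in, not the authors' metric. It sketches one simple way an extracted series could be scored against a ground-truth series when the two differ in length or completeness. The function name `toy_curve_similarity` and the scoring rule are assumptions.

```python
from datetime import date


def toy_curve_similarity(
    truth: dict[date, float], extracted: dict[date, float]
) -> float:
    """Illustrative score in [0, 1]; NOT the paper's ECS definition.

    Dates are aligned over the union of both series. A date missing from
    either series counts as zero cases, so dropped or spurious points
    lower the score.
    """
    all_dates = set(truth) | set(extracted)
    if not all_dates:
        return 1.0  # two empty curves are trivially identical
    abs_error = sum(
        abs(truth.get(d, 0.0) - extracted.get(d, 0.0)) for d in all_dates
    )
    scale = sum(
        abs(truth.get(d, 0.0)) + abs(extracted.get(d, 0.0)) for d in all_dates
    )
    if scale == 0:
        return 1.0  # both curves contain only zeros
    return 1.0 - abs_error / scale


# Made-up example: the extraction misreads one bar and drops the last one.
truth = {date(2025, 1, 6): 3, date(2025, 1, 13): 8, date(2025, 1, 20): 5}
extracted = {date(2025, 1, 6): 3, date(2025, 1, 13): 6}
score = toy_curve_similarity(truth, extracted)
```

Aligning on dates rather than on list position is the design choice that lets a score like this tolerate series of different lengths. The real ECS metric may handle temporal structure quite differently.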

Main Results:

  • The EpiCurveBench dataset includes a wide variety of chart styles, from simple to complex.
  • The novel EpiCurve Similarity (ECS) metric effectively captures temporal structures and handles variations in data length and completeness.
  • The best-performing model achieved only 42.9% ECS on EpiCurveBench, indicating substantial room for improvement in automated epicurve extraction.

Conclusions:

  • Existing automated methods for extracting data from epicurve images require significant improvement.
  • The EpiCurveBench dataset and ECS metric provide a robust platform for advancing research in automated chart data extraction, particularly for epidemiological forecasting.
  • This work facilitates the expansion of machine-readable epidemiological data, crucial for enhancing disease forecasting model accuracy.