An Expedited Chart Review Process for Large Database Studies Using Natural Language Processing and Multiwave Adaptive

Shirley V Wang¹, Georg Hahn¹, Sushama Kattinakere Sreedhara¹

¹From the Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.

Epidemiology (Cambridge, Mass.)

April 7, 2026

Summary

This summary is machine-generated.

This study introduces an efficient method for validating health outcome algorithms in large databases using natural language processing (NLP) and adaptive sampling. This approach significantly reduces chart review time and resources, improving the reliability of health research findings.

Keywords:

Bayesian; Chart review; Multiwave sampling; NLP; Validation study

Area of Science:

Health Informatics
Biostatistics
Data Science in Healthcare

Background:

Validating code-based algorithms in large claims databases is crucial for reliable analyses.
Manual chart review of electronic health records is time-consuming and resource-intensive.
Outcome misclassification can bias results in inferential studies.

Purpose of the Study:

To describe an expedited process for validating code-based algorithms.
To introduce efficiency through natural language processing (NLP) and adaptive sampling.
To illustrate the process with a case study validating an algorithm for intentional self-harm in patients with obesity.

Main Methods:

Utilized natural language processing (NLP) to reduce human reviewer time per chart.
Implemented a multi-wave adaptive sampling approach with pre-defined stopping criteria.
Validated a claims-based outcome algorithm for intentional self-harm in an obesity cohort.
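The multi-wave idea can be illustrated with a minimal sketch. The summary does not state the study's actual stopping criteria or how sampling adapts between waves, so everything here is an assumption made for illustration: a Beta(1, 1) prior on the algorithm's positive predictive value (PPV), fixed-size waves of randomly ordered charts, and a rule that stops review once the 95% credible interval is narrower than a pre-defined width. The names `credible_interval`, `multiwave_review`, `wave_size`, and `max_width` are hypothetical.

```python
import random


def credible_interval(confirmed, not_confirmed, rng, level=0.95, draws=4000):
    """Monte Carlo credible interval for PPV under an assumed Beta(1, 1) prior."""
    a, b = 1 + confirmed, 1 + not_confirmed
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]


def multiwave_review(chart_labels, wave_size=25, max_width=0.20, seed=0):
    """Review charts in waves; stop once the PPV interval is narrow enough.

    chart_labels: 1 if chart review confirms the algorithm-flagged outcome, else 0.
    In practice each label is only known after a reviewer reads the chart;
    here the labels are supplied up front to simulate that process.
    """
    rng = random.Random(seed)
    order = list(range(len(chart_labels)))
    rng.shuffle(order)  # random review order (assumed, not from the source)

    confirmed = reviewed = 0
    lo, hi = 0.0, 1.0
    for start in range(0, len(order), wave_size):
        wave = order[start:start + wave_size]
        confirmed += sum(chart_labels[i] for i in wave)
        reviewed += len(wave)
        lo, hi = credible_interval(confirmed, reviewed - confirmed, rng)
        if hi - lo <= max_width:  # pre-defined stopping criterion (assumed form)
            break
    return {"reviewed": reviewed, "confirmed": confirmed, "interval": (lo, hi)}


if __name__ == "__main__":
    # Simulated cohort: 400 flagged charts, 320 of which would be confirmed.
    labels = [1] * 320 + [0] * 80
    result = multiwave_review(labels)
    print(result["reviewed"], "of", len(labels), "charts reviewed")
    print("95% credible interval for PPV:", result["interval"])
```

A narrower `max_width` demands more waves before stopping, which is the trade-off between reviewer effort and precision that a stopping rule of this kind makes explicit.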

Main Results:

The NLP-assisted annotation reduced review time per chart by 40%.
Adaptive sampling with a stopping rule would have avoided reviewing 77% of patient charts.
The estimated performance characteristics remained sufficiently precise, with only a limited loss of precision.
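As a back-of-envelope illustration of how the two reported savings might combine, the sketch below assumes they are independent and multiply, which the summary does not state. The function name `remaining_workload` is hypothetical, and the output is an illustrative calculation, not a result reported by the study.

```python
def remaining_workload(time_saved_per_chart, charts_avoided):
    """Fraction of the original reviewer time still needed, assuming the
    per-chart saving and the avoided charts combine multiplicatively."""
    return (1 - time_saved_per_chart) * (1 - charts_avoided)


# Reported figures: 40% less time per chart, 77% of charts not reviewed.
remaining = remaining_workload(0.40, 0.77)
print(f"Remaining share of reviewer time: {remaining:.3f}")
```

Under that assumption, roughly 14% of the original reviewer time would remain.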

Conclusions:

The described approach facilitates routine validation of code-based algorithms.
This improves understanding of the reliability of findings from database studies.
It also improves the robustness of health research using large datasets.