A bootstrapping algorithm to improve cohort identification using structured data.

Sasikiran Kandula¹, Qing Zeng-Treitler¹, Lingji Chen²

¹Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

Journal of Biomedical Informatics

November 15, 2011

Summary

This summary is machine-generated.

This study introduces a novel bootstrapping method for patient cohort identification that improves on identification by ICD-9 codes alone by adding laboratory results and medication data. The approach supports clinical research by identifying patients with conditions such as Diabetes Mellitus and Hyperlipidemia who lack the corresponding ICD-9 codes.

Area of Science:

Clinical Informatics
Biomedical Data Science
Health Services Research

Background:

Accurate patient cohort identification is crucial for clinical research.
Traditional methods using International Classification of Diseases, Ninth Revision (ICD-9) codes have limitations in precision for many conditions.
There is a need for more accurate and automated methods for cohort identification.

Purpose of the Study:

To develop and evaluate a novel bootstrapping method for enhanced patient cohort identification.
To improve the accuracy of identifying cohorts for conditions such as Diabetes Mellitus (DM) and Hyperlipidemia (HL).
To demonstrate a method that does not require pre-existing true class labels for patients.

Main Methods:

A bootstrapping method was developed, integrating ICD-9 codes with laboratory results and medication data.
Classification models were built using this integrated data to identify patient cohorts.
The method was applied to a large database of 800,000 patients to identify DM and HL cohorts.
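The general idea behind these steps can be sketched in a few lines of Python. This is only an illustrative self-training loop under stated assumptions, not the authors' algorithm: the summary does not say which classifier or features were used, so a nearest-centroid rule is swapped in for simplicity, and the `Patient` fields, the function names, and all values are hypothetical. ICD-9 codes act as noisy starting labels, a model is fitted on lab and medication features, every patient is relabeled by the model, and the process repeats until the labels stop changing.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Patient:
    has_icd9: bool       # noisy seed label: an ICD-9 code for the condition is on record
    lab_value: float     # hypothetical lab result (for example, an HbA1c percentage)
    on_medication: bool  # hypothetical flag for a condition-specific medication


def features(p):
    """Turn one patient into a numeric feature vector (lab value, medication flag)."""
    return (p.lab_value, 1.0 if p.on_medication else 0.0)


def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return tuple(mean(col) for col in zip(*rows))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bootstrap_labels(patients, max_rounds=10):
    """Start from ICD-9 labels, then alternate between fitting and relabeling."""
    X = [features(p) for p in patients]
    labels = [p.has_icd9 for p in patients]
    for _ in range(max_rounds):
        pos = [x for x, y in zip(X, labels) if y]
        neg = [x for x, y in zip(X, labels) if not y]
        if not pos or not neg:
            break  # one class is empty, so no classifier can be fitted
        c_pos, c_neg = centroid(pos), centroid(neg)
        # Relabel every patient by the nearer class centroid.
        new_labels = [sq_dist(x, c_pos) < sq_dist(x, c_neg) for x in X]
        if new_labels == labels:
            break  # labels are stable, so stop iterating
        labels = new_labels
    return labels


# Toy, made-up cohort: the fourth patient has abnormal labs and a medication
# on record but no ICD-9 code.
demo = [
    Patient(True, 8.5, True),
    Patient(True, 9.0, True),
    Patient(True, 7.8, True),
    Patient(False, 8.8, True),
    Patient(False, 5.2, False),
    Patient(False, 5.5, False),
]
final = bootstrap_labels(demo)
newly_identified = [i for i, (p, y) in enumerate(zip(demo, final))
                    if y and not p.has_icd9]
print("newly identified patient indexes:", newly_identified)
```

Note that no chart-reviewed labels enter the loop, which mirrors the stated goal of not requiring true class labels. A real application would need feature scaling, a stronger classifier, and safeguards against the labels drifting away from the seed codes.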

Main Results:

The proposed method labeled 11,000 patients as positive for DM and 52,000 as positive for HL who lacked the corresponding ICD-9 codes.
Clinical chart reviews of 400 patients (200 per condition) indicated the bootstrapping method's labeling was more consistent with clinician assessments than ICD-9 codes alone.
Within this chart-review sample, the approach was therefore more accurate than relying on ICD-9 codes alone.
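One simple way to make "more consistent with clinician assessments" concrete is the proportion of reviewed patients on whom a labeling agrees with the chart review. The sketch below assumes this measure for illustration only; the summary does not state which statistic the study used, and the three label lists are made-up toy values, not study data.

```python
def agreement(labels, reference):
    """Fraction of patients on whom `labels` matches the reference labeling."""
    if not reference or len(labels) != len(reference):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(labels, reference)) / len(reference)


# Hypothetical toy values for six reviewed patients (True = has the condition).
chart_review = [True, True, True, False, False, True]   # clinician assessment
icd9_only    = [True, False, True, False, False, False]  # ICD-9 code present
bootstrapped = [True, True, True, False, True, True]     # model-assigned label

print("ICD-9 agreement:       ", agreement(icd9_only, chart_review))
print("Bootstrapped agreement:", agreement(bootstrapped, chart_review))
```

Raw agreement ignores chance agreement and class imbalance, so a chance-corrected statistic or separate sensitivity and specificity figures may be more informative in practice.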

Conclusions:

The developed bootstrapping method offers a more accurate and automated approach to patient cohort identification.
This method supplements ICD-9 coding with laboratory results and medication data, improving precision.
The approach holds significant potential for cost-effective and reliable cohort identification in clinical research.