A bootstrapping algorithm to improve cohort identification using structured data.

Sasikiran Kandula1, Qing Zeng-Treitler1, Lingji Chen2

  • 1Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

Journal of Biomedical Informatics | November 15, 2011
Summary
This summary is machine-generated.

This study introduces a bootstrapping method for patient cohort identification that improves on International Classification of Diseases, Ninth Revision (ICD-9) codes alone by also using laboratory results and medication data. The method identifies additional patients with conditions such as diabetes mellitus and hyperlipidemia who lack the corresponding ICD-9 codes.

Area of Science:

  • Clinical Informatics
  • Biomedical Data Science
  • Health Services Research

Background:

  • Accurate patient cohort identification is crucial for clinical research.
  • Traditional methods using International Classification of Diseases, Ninth Revision (ICD-9) codes have limitations in precision for many conditions.
  • There is a need for more accurate and automated methods for cohort identification.

Purpose of the Study:

  • To develop and evaluate a novel bootstrapping method for enhanced patient cohort identification.
  • To improve the accuracy of identifying cohorts for conditions such as Diabetes Mellitus (DM) and Hyperlipidemia (HL).
  • To demonstrate a method that does not require pre-existing true class labels for patients.

Main Methods:

  • A bootstrapping method was developed, integrating ICD-9 codes with laboratory results and medication data.
  • Classification models were built using this integrated data to identify patient cohorts.
  • The method was applied to a large database of 800,000 patients to identify DM and HL cohorts.
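The general idea behind the methods above can be sketched as a self-training loop: start from ICD-9 codes as noisy seed labels, fit a classifier on laboratory and medication features, relabel every patient, and repeat until the labels stop changing. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The summary does not specify the classification models, features, or stopping rule, so the `Patient` fields, the nearest-centroid classifier, and the `max_rounds` parameter are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record; field names are assumptions, not from the study."""
    has_icd9: bool       # seed label: diagnosis code present in the record
    lab_value: float     # a single illustrative laboratory result
    on_medication: bool  # whether a condition-related drug was prescribed


def features(p):
    # Encode the non-ICD-9 evidence as a numeric feature vector.
    return (p.lab_value, 1.0 if p.on_medication else 0.0)


def centroid(rows):
    # Component-wise mean of a non-empty list of feature vectors.
    n = len(rows)
    return tuple(sum(r[i] for r in rows) / n for i in range(len(rows[0])))


def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bootstrap_labels(patients, max_rounds=10):
    """Iteratively relabel patients, seeded by ICD-9 codes (no true labels needed)."""
    labels = [p.has_icd9 for p in patients]
    feats = [features(p) for p in patients]
    for _ in range(max_rounds):
        pos = [f for f, y in zip(feats, labels) if y]
        neg = [f for f, y in zip(feats, labels) if not y]
        if not pos or not neg:
            break  # cannot fit a two-class model with an empty class
        c_pos, c_neg = centroid(pos), centroid(neg)
        # Assign each patient to the nearer class centroid.
        new_labels = [sq_dist(f, c_pos) < sq_dist(f, c_neg) for f in feats]
        if new_labels == labels:
            break  # labels have stabilized
        labels = new_labels
    return labels
```

Patients whose final label is positive but whose `has_icd9` flag is false correspond to the kind of uncoded cases the method is meant to surface. A real pipeline would scale features and use a stronger classifier; the nearest-centroid rule is used here only to keep the sketch dependency-free.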

Main Results:

  • The proposed method labeled 11,000 patients as positive for DM and 52,000 as positive for HL who lacked the corresponding ICD-9 codes.
  • Clinical chart reviews of 400 patients (200 per condition) indicated the bootstrapping method's labeling was more consistent with clinician assessments than ICD-9 codes alone.
  • Judged against the chart reviews, the approach was more accurate than relying on ICD-9 codes alone.
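The chart-review comparison above amounts to measuring how often each labeling source agrees with the clinician's assessment. A minimal sketch of that calculation follows, assuming simple proportion agreement as the metric; the summary does not state which agreement statistic the study used, and the function name `agreement` is hypothetical.

```python
def agreement(predicted, reference):
    """Fraction of patients on which two label lists agree."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        raise ValueError("label lists must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)
```

Calling `agreement(icd9_labels, chart_labels)` and `agreement(bootstrap_result, chart_labels)` on the same reviewed patients gives two directly comparable numbers; the variable names in this usage are placeholders, and no values from the study are reproduced here.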

Conclusions:

  • The developed bootstrapping method offers a more accurate and automated approach to patient cohort identification.
  • This method supplements traditional coding systems by incorporating diverse clinical data, improving precision.
  • The approach holds significant potential for cost-effective and reliable cohort identification in clinical research.