Sampling for computational efficiency when conducting analyses in big data.

Jacqueline E Rudolph1, Yiyi Zhou1, Maylin Palatino1

  • 1Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States.

American Journal of Epidemiology
|December 4, 2025
Summary
This summary is machine-generated.

Sampling methods can estimate lung cancer incidence efficiently in big data. Compared with divide-and-recombine, the sub-cohort and case-cohort approaches ran faster and used less memory while yielding similar results.

Keywords:
big data, case-cohort, computational efficiency, divide-and-recombine, sampling methods, subcohort

Area of Science:

  • Epidemiology
  • Biostatistics
  • Health Informatics

Background:

  • Big data research presents computational challenges, especially when bias-correction methods are applied.
  • Efficiently analyzing large datasets is crucial for public health research.

Purpose of the Study:

  • To evaluate sampling methods for estimating lung cancer incidence by HIV status in a large cohort.
  • To compare the accuracy, speed, and resource usage of different sampling techniques.

Main Methods:

  • Utilized a cohort of nearly 30 million Medicaid beneficiaries to assess lung cancer incidence.
  • Employed three sampling schemes (divide-and-recombine, sub-cohort, and case-cohort) alongside inverse probability weighting for confounder control.
  • Estimated incidence rate ratio (IRR), hazard ratio (HR), and risk ratio (RR).
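
The case-cohort idea in the methods above can be illustrated with a minimal Python sketch. It is not the authors' code: the simulated cohort, the event rates, the 10% sampling fraction, and the function names are all assumptions made for illustration. The weighting shown (weight 1 for cases, 1/fraction for sampled non-cases) is one common case-cohort scheme, and the summary does not say which weights the study used. Follow-up time is drawn independently of the event, which is a simplification.

```python
import random


def simulate_cohort(n, seed=0):
    """Return a toy cohort as (exposed, event, person_time) tuples.

    All rates and proportions here are made up for illustration.
    """
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        exposed = rng.random() < 0.10            # 10% exposed (hypothetical)
        person_time = rng.uniform(0.5, 1.5)      # follow-up in arbitrary units
        rate = 0.02 if exposed else 0.01         # true IRR of 2 (hypothetical)
        event = rng.random() < rate * person_time
        cohort.append((exposed, event, person_time))
    return cohort


def case_cohort_sample(cohort, fraction, seed=1):
    """Keep every case plus a random sub-cohort drawn with probability `fraction`.

    Cases get weight 1; sampled non-cases get weight 1/fraction so that they
    stand in for the non-cases that were not drawn.
    """
    rng = random.Random(seed)
    sample = []
    for exposed, event, person_time in cohort:
        in_subcohort = rng.random() < fraction
        if event:
            sample.append((exposed, event, person_time, 1.0))
        elif in_subcohort:
            sample.append((exposed, event, person_time, 1.0 / fraction))
    return sample


def weighted_irr(rows):
    """Weighted incidence rate ratio from (exposed, event, person_time, weight) rows."""
    events = {True: 0.0, False: 0.0}
    time_at_risk = {True: 0.0, False: 0.0}
    for exposed, event, person_time, weight in rows:
        events[exposed] += weight if event else 0.0
        time_at_risk[exposed] += weight * person_time
    rate_exposed = events[True] / time_at_risk[True]
    rate_unexposed = events[False] / time_at_risk[False]
    return rate_exposed / rate_unexposed


cohort = simulate_cohort(200_000)
full_irr = weighted_irr((e, d, t, 1.0) for e, d, t in cohort)
sample = case_cohort_sample(cohort, fraction=0.10)
sample_irr = weighted_irr(sample)
print(f"full cohort: n={len(cohort)}, IRR={full_irr:.2f}")
print(f"case-cohort: n={len(sample)}, IRR={sample_irr:.2f}")
```

Because every case is retained, the numerators of the rates are unchanged; only the person-time denominators are estimated from the weighted sub-cohort, which is why a small sample can track the full-cohort estimate closely when events are rare.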

Main Results:

  • Observed 1,113 lung cancer diagnoses in beneficiaries with HIV and 33,106 in those without HIV.
  • Sub-cohort and case-cohort sampling yielded estimates comparable to the full sample.
  • These methods were faster and required less memory than divide-and-recombine, particularly for risk ratio estimation.
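
For contrast, divide-and-recombine can be sketched as follows. This is a hedged illustration, not the study's procedure: the summary does not describe how blocks were formed or recombined, so the interleaved blocks, the inverse-variance pooling of log IRRs, the simulated data, and all function names are assumptions.

```python
import math
import random


def simulate_counts(n, seed=0):
    """Toy cohort as (exposed, event, person_time) tuples; all numbers hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        exposed = rng.random() < 0.10
        person_time = rng.uniform(0.5, 1.5)
        rate = 0.02 if exposed else 0.01
        rows.append((exposed, rng.random() < rate * person_time, person_time))
    return rows


def log_irr_and_variance(rows):
    """Log incidence rate ratio and its large-sample variance (1/a1 + 1/a0)."""
    events = {True: 0, False: 0}
    time_at_risk = {True: 0.0, False: 0.0}
    for exposed, event, person_time in rows:
        events[exposed] += int(event)
        time_at_risk[exposed] += person_time
    if events[True] == 0 or events[False] == 0:
        raise ValueError("block has no events in one exposure group")
    log_irr = math.log((events[True] / time_at_risk[True])
                       / (events[False] / time_at_risk[False]))
    return log_irr, 1.0 / events[True] + 1.0 / events[False]


def divide_and_recombine(rows, n_blocks):
    """Estimate the IRR block by block, then pool with inverse-variance weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for b in range(n_blocks):
        block = rows[b::n_blocks]          # every n_blocks-th row, starting at b
        log_irr, variance = log_irr_and_variance(block)
        weighted_sum += log_irr / variance
        weight_total += 1.0 / variance
    return math.exp(weighted_sum / weight_total)


rows = simulate_counts(100_000)
full_irr = math.exp(log_irr_and_variance(rows)[0])
pooled_irr = divide_and_recombine(rows, n_blocks=10)
print(f"full IRR={full_irr:.2f}, divide-and-recombine IRR={pooled_irr:.2f}")
```

Note that divide-and-recombine still touches every record, one block at a time, whereas a sub-cohort or case-cohort analysis discards most non-cases up front. That difference is consistent with the speed and memory advantage reported above, although this toy example does not measure it.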

Conclusions:

  • Sampling methods, specifically sub-cohort and case-cohort, provide efficient and accurate parameter estimation in big data analyses.
  • These approaches reduce computational burden without significantly compromising results.
  • HIV status may influence lung cancer incidence, warranting further investigation.