



Dimensionality Reduction Techniques for Improving Propensity Score Specification: An Application to a Cohort Study

Sudhir Venkatesan (1), Jonatan Nåtman (2), Eva Lesén (3)

  • (1) BPM Evidence Statistics, BioPharmaceuticals Medical, AstraZeneca, Cambridge, UK.

Pharmacoepidemiology and Drug Safety | November 24, 2025
Summary
This summary is machine-generated.

Dimensionality reduction techniques, especially autoencoders, improve covariate balance for propensity score estimation in large healthcare databases. These methods offer better confounding control in pharmacoepidemiological studies compared to traditional approaches.

Keywords:
autoencoders, confounding, covariate balance, dimensionality reduction, high-dimensional data, propensity scores


Area of Science:

  • Pharmacoepidemiology
  • Biostatistics
  • Health Informatics

Background:

  • Propensity score (PS) estimation is crucial for confounding control in observational studies.
  • High-dimensional data presents challenges for traditional PS estimation methods.
  • Effective covariate balance is essential for valid causal inference.

Purpose of the Study:

  • To evaluate dimensionality reduction techniques for propensity score estimation in high-dimensional claims data.
  • To compare the performance of these techniques against conventional methods for covariate balance and confounding control.
  • To assess the utility of autoencoders, principal component analysis (PCA), and logistic PCA in pharmacoepidemiological research.

Main Methods:

  • A cohort study using claims data examined the association between dialysis and in-hospital mortality in older patients with heart failure and chronic kidney disease.
  • Propensity scores were estimated using five approaches: investigator-specified covariates, the high-dimensional propensity score (hdPS), PCA, logistic PCA, and autoencoders.
  • Covariate balance was assessed using standardized mean differences (SMD) and propensity score overlap plots.
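
As a rough illustration of the PCA-based approach listed above, the sketch below reduces a high-dimensional covariate matrix to a few principal components and then fits a logistic regression of treatment on those components to obtain propensity scores. It is not the authors' implementation: the function names (`pca_scores`, `fit_logistic`, `propensity_scores`), the number of components, the small ridge term, and the synthetic data are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np


def pca_scores(X, n_components):
    """Project standardized covariates onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0)
    sd[sd == 0] = 1.0  # avoid dividing by zero for constant columns
    Z = Xc / sd
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def fit_logistic(F, t, n_iter=25, ridge=1e-6):
    """Logistic regression via Newton-Raphson; returns intercept and slopes.

    The tiny ridge term is an assumption added only for numerical stability.
    """
    A = np.column_stack([np.ones(len(F)), F])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        w = p * (1.0 - p)
        grad = A.T @ (t - p) - ridge * beta
        hess = (A * w[:, None]).T @ A + ridge * np.eye(A.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def propensity_scores(X, t, n_components):
    """Estimated probability of treatment given the principal component scores."""
    F = pca_scores(X, n_components)
    beta = fit_logistic(F, t)
    A = np.column_stack([np.ones(len(F)), F])
    return 1.0 / (1.0 + np.exp(-A @ beta))


# Synthetic stand-in for binary claims-code indicators (hypothetical data).
rng = np.random.default_rng(0)
n, p = 500, 40
X = (rng.random((n, p)) < 0.3).astype(float)
logit = -1.1 + 0.4 * X[:, :5].sum(axis=1)
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

ps = propensity_scores(X, t, n_components=10)
```

The same two-step pattern (compress the covariates, then model treatment on the compressed representation) would apply to logistic PCA or an autoencoder; only the first step changes.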

Main Results:

  • Autoencoder-based PS achieved the best covariate balance, with only 8 covariates exceeding an SMD of 0.1.
  • PCA and logistic PCA also improved balance relative to hdPS and investigator-specified covariates.
  • Hazard ratios for in-hospital mortality were comparable across all propensity score estimation methods.
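
The balance criterion used in these results (the number of covariates with an SMD above 0.1) can be sketched as follows. This is a minimal, unweighted version under stated assumptions: the function names and the toy data are hypothetical, the SMD uses the common pooled-standard-deviation form, and the summary does not specify whether the study computed it on a matched or weighted sample.

```python
import numpy as np


def standardized_mean_differences(X, t):
    """Absolute SMD per covariate: |mean_1 - mean_0| / sqrt((var_1 + var_0) / 2)."""
    X1, X0 = X[t == 1], X[t == 0]
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    pooled_sd = np.sqrt((X1.var(axis=0, ddof=1) + X0.var(axis=0, ddof=1)) / 2.0)
    smd = np.zeros(X.shape[1])
    # Covariates with zero pooled SD are reported as SMD 0 instead of dividing by zero.
    np.divide(np.abs(diff), pooled_sd, out=smd, where=pooled_sd > 0)
    return smd


def count_imbalanced(X, t, threshold=0.1):
    """Number of covariates whose absolute SMD exceeds the threshold."""
    return int(np.sum(standardized_mean_differences(X, t) > threshold))


# Tiny hand-checkable example: two covariates, two treated and two untreated rows.
X = np.array([[1.0, 0.0],
              [3.0, 1.0],
              [2.0, 0.0],
              [4.0, 1.0]])
t = np.array([1, 1, 0, 0])

smd = standardized_mean_differences(X, t)
n_imbalanced = count_imbalanced(X, t)
```

In this toy example the first covariate differs by one unit between groups with a pooled standard deviation of about 1.41, so it counts as imbalanced, while the second covariate has identical group means and does not.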

Conclusions:

  • Dimensionality reduction techniques, particularly autoencoders, achieved better covariate balance than hdPS and investigator-specified covariates when estimating propensity scores from high-dimensional claims data.
  • These advanced methods hold promise for enhancing the validity of propensity score-matched designs in large-scale pharmacoepidemiological studies.
  • Autoencoders offer a powerful approach to improve confounding control and covariate balance in real-world evidence research.