Qualifying Missingness in Real-World Clinical Data for Secondary Use.

Pauline Fracasso¹, Morgane Pierre-Jean¹, Gouenou Coatrieux²

  • ¹Univ Rennes, CHU Rennes, INSERM, LTSI-UMR 1099, F-35000 Rennes, France.

Studies in Health Technology and Informatics | May 23, 2026
Summary
This summary is machine-generated.

Descriptors that characterize missing data in clinical data warehouses (CDWs) give an interpretable view of data quality. The approach helps explain real-world data patterns and assess dataset integrity.

Keywords:
clinical data warehouse (CDW); descriptors; missing at random (MAR); missing completely at random (MCAR); missing data; missing not at random (MNAR); scenarios

Area of Science:

  • Data Science
  • Health Informatics
  • Biostatistics

Background:

  • Clinical data warehouses (CDWs) frequently encounter missing data due to unsystematic collection practices.
  • Understanding the characteristics of missing data is crucial for assessing data quality and interpreting real-world data.
  • Existing methods for qualifying clinical data still need a systematic evaluation of their effectiveness.

Purpose of the Study:

  • To identify and evaluate methods for characterizing missing data in clinical data warehouses.
  • To develop and implement an automated pipeline for extracting data quality descriptors.
  • To assess the utility of these descriptors in reflecting different missing data scenarios.

Main Methods:

  • A scoping review identified relevant data qualification methods for clinical data.
  • An automated pipeline was developed to extract data quality descriptors.
  • Four distinct missing data scenarios were simulated from a complete dataset for evaluation.
  • Statistical tests (Friedman, Cochran's Q) were used to analyze descriptor variability across scenarios.
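To make the simulation step concrete, here is a minimal sketch of how missingness could be injected into a complete dataset and then summarized by a few descriptors. It is an illustration under stated assumptions, not the authors' pipeline: the summary does not name the four scenarios, so the sketch uses only the three mechanisms listed in the keywords (MCAR, MAR, MNAR). The variables `x` and `y`, the probabilities, and the three descriptors are hypothetical choices.

```python
import random
import statistics


def make_complete(n, rng):
    """Toy complete dataset: y is correlated with x (hypothetical variables)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        rows.append({"x": x, "y": x + rng.gauss(0.0, 1.0)})
    return rows


def simulate_missingness(rows, mechanism, rng, rate=0.3, low=0.1, high=0.6):
    """Return a copy of rows in which some y values are replaced by None."""
    x_median = statistics.median(r["x"] for r in rows)
    y_median = statistics.median(r["y"] for r in rows)
    masked = []
    for r in rows:
        if mechanism == "MCAR":    # same probability for every row
            p = rate
        elif mechanism == "MAR":   # depends only on the always-observed x
            p = high if r["x"] > x_median else low
        elif mechanism == "MNAR":  # depends on the value that goes missing
            p = high if r["y"] > y_median else low
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        masked.append({"x": r["x"], "y": None if rng.random() < p else r["y"]})
    return masked


def describe(rows):
    """Three illustrative descriptors of the missingness in y."""
    x_missing = [r["x"] for r in rows if r["y"] is None]
    x_observed = [r["x"] for r in rows if r["y"] is not None]
    y_observed = [r["y"] for r in rows if r["y"] is not None]
    return {
        "missing_rate": len(x_missing) / len(rows),
        # Mean x where y is missing minus mean x where y is observed.
        "x_gap": statistics.fmean(x_missing) - statistics.fmean(x_observed),
        "observed_y_mean": statistics.fmean(y_observed),
    }


rng = random.Random(0)
complete = make_complete(2000, rng)
for mechanism in ("MCAR", "MAR", "MNAR"):
    print(mechanism, describe(simulate_missingness(complete, mechanism, rng)))
```

In this toy setup, `x_gap` should stay near zero under MCAR and move away from zero once missingness depends on `x` or on a variable correlated with it. That is the kind of scenario-dependent behavior a useful descriptor would be expected to show.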

Main Results:

  • 230 descriptors were extracted per incomplete dataset.
  • 37.4% of extracted descriptors showed statistically significant differences across simulated scenarios after correction.
  • The most effective descriptors accurately mirrored the underlying structures of the simulated missing data patterns.
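For a binary descriptor, the comparison across scenarios can be illustrated with Cochran's Q, one of the two tests named in the methods. The sketch below computes the statistic from its textbook formula using only the standard library. The table of flags is invented for illustration (rows are hypothetical replicate datasets, columns are four scenarios), and the paper's actual data layout and correction procedure are not given in this summary.

```python
def cochrans_q(table):
    """Cochran's Q statistic for a b x k table of 0/1 outcomes.

    Rows are matched blocks (here: replicate datasets); columns are the
    conditions being compared (here: missing-data scenarios).
    Returns (Q, degrees_of_freedom); Q is approximately chi-square
    distributed with k - 1 degrees of freedom under the null hypothesis.
    """
    if not table:
        raise ValueError("table is empty")
    k = len(table[0])
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("need a rectangular table with at least 2 columns")
    if any(v not in (0, 1) for row in table for v in row):
        raise ValueError("Cochran's Q requires binary outcomes")
    col_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    denom = k * n - sum(r * r for r in row_totals)
    if denom == 0:
        raise ValueError("every block is constant; Q is undefined")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


# Invented example: 1 means a hypothetical descriptor flag was raised.
flags = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]
q, df = cochrans_q(flags)
print(q, df)  # 10.8 3
```

With 3 degrees of freedom the 5% chi-square critical value is about 7.815, so this invented table would count as differing across scenarios before any multiple-testing correction. Continuous descriptors would instead call for a rank-based test such as the Friedman test.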

Conclusions:

  • Descriptor-based qualification provides interpretable insights into data quality within clinical data warehouses.
  • This approach aids in understanding the nature and extent of missingness in real-world clinical data.
  • Automated extraction and analysis of these descriptors enhance the assessment of CDW data integrity.