



Optimal subsampling for parametric accelerated failure time models with massive survival data.

Zehan Yang¹, HaiYing Wang¹, Jun Yan¹

  • ¹Department of Statistics, University of Connecticut, Storrs, Connecticut.

Statistics in Medicine | September 21, 2022
Summary
This summary is machine-generated.

This study introduces optimal subsampling for massive survival data, which speeds up the fitting of parametric accelerated failure time models. The method supports valid statistical inference even when the full dataset is too large to process in memory.

Keywords:
A-optimality, L-optimality, censoring, survival analysis


Area of Science:

  • Biostatistics
  • Statistical Computing
  • Machine Learning

Background:

  • Massive survival datasets necessitate computationally efficient statistical inference methods.
  • Existing approaches like online updating and divide-and-conquer are limited, particularly for relative risk models.
  • Subsampling strategies remain underdeveloped for semiparametric models with censored data because their asymptotic properties are difficult to establish.

Purpose of the Study:

  • To develop optimal subsampling algorithms for fast approximation of the maximum likelihood estimator (MLE) in parametric accelerated failure time (AFT) models.
  • To address computational limitations posed by massive survival data.
  • To enable valid statistical inferences for large-scale survival analysis.
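
For orientation, the MLE that the subsampling algorithms approximate can be written out for a generic parametric AFT model with right censoring. The notation below (T_i, C_i, x_i, β, σ, f, S) is assumed here for illustration and is not taken from the summary.

```latex
% Parametric AFT model for subject i = 1, ..., n (notation assumed for illustration)
\log T_i = x_i^\top \beta + \sigma \varepsilon_i, \qquad
\varepsilon_i \sim f \ \text{(known density with survival function } S\text{)}

% Observed data under right censoring at C_i
y_i = \min(\log T_i, \log C_i), \qquad \delta_i = \mathbf{1}\{T_i \le C_i\}

% Log-likelihood maximized by the full-data MLE, with \theta = (\beta, \sigma)
\ell(\theta) = \sum_{i=1}^{n} \Big[ \delta_i \{ \log f(z_i) - \log \sigma \}
             + (1 - \delta_i) \log S(z_i) \Big],
\qquad z_i = \frac{y_i - x_i^\top \beta}{\sigma}
```

Taking f to be the standard extreme-value density gives the Weibull model, and a standard normal density gives the log-normal model. Each evaluation of the log-likelihood requires a pass over all n observations, which is the cost that subsampling aims to reduce.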

Main Methods:

  • Developed optimal subsampling algorithms for parametric AFT models.
  • Derived asymptotic distributions for the subsampling estimator and optimal sampling probabilities.
  • Proposed a feasible two-step algorithm to estimate optimal sampling probabilities using a pilot sample.
  • Established asymptotic properties of the two-step estimator.
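
The two-step idea above can be sketched in Python under stated assumptions. This is a minimal illustration, not the authors' implementation: it assumes a Weibull AFT model, takes the sampling probabilities proportional to the norm of each observation's score at the pilot estimate (an L-optimality-style choice common in the subsampling literature), and uses only the second-step sample in the final fit. The function names `scores`, `fit`, `simulate`, and `two_step_subsample` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def scores(theta, X, y, delta):
    """Per-observation score vectors of a Weibull AFT log-likelihood.

    theta = (beta, log sigma); y = log of the observed time;
    delta = 1 if the event was observed, 0 if right-censored.
    """
    beta, s = theta[:-1], theta[-1]
    z = (y - X @ beta) * np.exp(-s)
    a = np.exp(z) - delta
    # Columns: derivatives w.r.t. beta, then w.r.t. log sigma.
    return np.column_stack([X * (a * np.exp(-s))[:, None], a * z - delta])


def fit(X, y, delta, w):
    """Weighted maximum likelihood estimate of theta (hypothetical helper)."""
    w = w / w.mean()  # rescaling the weights leaves the maximizer unchanged

    def nll(theta):
        beta, s = theta[:-1], theta[-1]
        z = (y - X @ beta) * np.exp(-s)
        return -np.mean(w * (delta * (z - s) - np.exp(z)))

    def grad(theta):
        return -np.mean(w[:, None] * scores(theta, X, y, delta), axis=0)

    # Least-squares start; it ignores censoring and serves only as an initial value.
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
    theta0 = np.append(beta0, np.log(np.std(y - X @ beta0)))
    return minimize(nll, theta0, jac=grad, method="BFGS").x


def simulate(n, rng, beta=(1.0, -0.5), sigma=0.5):
    """Simulate right-censored Weibull AFT data (illustrative settings)."""
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    # The log of a unit exponential follows the standard extreme-value law.
    log_t = X @ np.array(beta) + sigma * np.log(rng.exponential(size=n))
    log_c = np.log(rng.exponential(scale=10.0, size=n))
    return X, np.minimum(log_t, log_c), (log_t <= log_c).astype(float)


def two_step_subsample(X, y, delta, r0, r, rng):
    """Pilot fit, approximate sampling probabilities, weighted second fit."""
    n = len(y)
    # Step 1: uniform pilot subsample and pilot estimate.
    idx0 = rng.choice(n, size=r0, replace=True)
    theta_pilot = fit(X[idx0], y[idx0], delta[idx0], np.ones(r0))
    # Assumed probability rule: proportional to the score norm at the pilot.
    norms = np.linalg.norm(scores(theta_pilot, X, y, delta), axis=1)
    pi = norms / norms.sum()
    # Step 2: sample with replacement, then fit with inverse-probability weights.
    idx = rng.choice(n, size=r, replace=True, p=pi)
    return fit(X[idx], y[idx], delta[idx], 1.0 / pi[idx])
```

Calling `two_step_subsample(X, y, delta, r0=500, r=2000, rng=np.random.default_rng(0))` on data from `simulate` returns an estimate of (β, log σ) from 2,000 sampled rows instead of the full dataset. The inverse-probability weights correct for the non-uniform sampling, so the weighted objective still targets the full-data log-likelihood.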

Main Results:

  • The proposed subsampling approach effectively approximates the MLE for massive survival data.
  • The derived optimal sampling probabilities minimize the asymptotic mean squared error (AMSE).
  • The two-step estimator has established asymptotic properties and performs reliably.
  • Simulation studies and real data analysis validate the method's effectiveness.
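
To make the optimality criteria concrete, the block below gives the form that A-optimal and L-optimal sampling probabilities commonly take in the general optimal subsampling literature. The notation is assumed for illustration, and the summary does not confirm that the paper's expressions match these term for term.

```latex
% Notation assumed for illustration: \dot\ell_i and \ddot\ell_i are the score and
% Hessian of observation i's log-likelihood, \hat\theta is the full-data MLE,
% \pi_i is the sampling probability, r the subsample size, n the full sample size.

% Inverse-probability-weighted subsample estimator (draws j = 1, ..., r, with replacement)
\tilde\theta = \arg\max_{\theta} \sum_{j=1}^{r} \frac{1}{\pi_j^{*}} \, \ell_j^{*}(\theta)

% Sandwich form of its asymptotic variance around \hat\theta
V = M^{-1} V_c M^{-1}, \qquad
M = \frac{1}{n} \sum_{i=1}^{n} \ddot\ell_i(\hat\theta), \qquad
V_c = \frac{1}{r n^{2}} \sum_{i=1}^{n}
      \frac{\dot\ell_i(\hat\theta)\, \dot\ell_i(\hat\theta)^\top}{\pi_i}

% A-optimality: minimize tr(V), the asymptotic mean squared error
\pi_i^{A} = \frac{\lVert M^{-1} \dot\ell_i(\hat\theta) \rVert}
                 {\sum_{k=1}^{n} \lVert M^{-1} \dot\ell_k(\hat\theta) \rVert}

% L-optimality: minimize tr(V_c), which avoids computing M^{-1}
\pi_i^{L} = \frac{\lVert \dot\ell_i(\hat\theta) \rVert}
                 {\sum_{k=1}^{n} \lVert \dot\ell_k(\hat\theta) \rVert}
```

Both sets of probabilities depend on the full-data MLE, which is unknown before fitting, so in practice they are approximated by plugging in an estimate from a pilot sample.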

Conclusions:

  • Optimal subsampling provides a computationally efficient and statistically valid method for survival modeling with massive datasets.
  • The proposed two-step algorithm is practical for estimating optimal sampling probabilities.
  • This work advances statistical inference for large-scale survival data analysis, overcoming memory constraints.