Updated: Dec 9, 2025

A divide-and-conquer method for sparse risk prediction and evaluation.

Chuan Hong¹, Yan Wang¹, Tianxi Cai²

¹Department of Biomedical Informatics, Harvard Medical School, Boston, 02115, MA, USA.

Biostatistics (Oxford, England)

September 10, 2020

Summary

This summary is machine-generated.

The SOLID algorithm and a modified cross-validation (MCV) procedure fit and evaluate sparse logistic regression models on massive datasets efficiently. They compute faster than existing divide-and-conquer approaches and support inference on the accuracy of the resulting risk predictions.

Keywords:

L1 regularization; Divide-and-conquer; Least square approximation; Logistic regression; Prediction accuracy; Predictive modeling; Variable selection

Area of Science:

Statistical modeling
Machine learning
Bioinformatics

Background:

Divide-and-conquer (DAC) algorithms are used for large datasets but can be computationally intensive.
Existing DAC methods for sparse regression lack inference capabilities for risk prediction accuracy.
Massive datasets require efficient algorithms for sparse predictive modeling.

Purpose of the Study:

To propose a computationally efficient algorithm, SOLID, for fitting sparse logistic regression to massive datasets.
To develop a modified cross-validation (MCV) procedure for accurate risk prediction model assessment.
To enable inference on the accuracy of predictive models using a novel approach.
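
For reference, sparse logistic regression is commonly written as an L1-penalized likelihood problem. The notation below (n observations, binary outcome y_i, covariate vector x_i, tuning parameter \lambda) is assumed here for illustration and is not taken from the paper, whose penalty may be weighted differently.

```latex
\hat{\beta}
  = \arg\min_{\beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[ y_i \, x_i^{\top}\beta - \log\!\left(1 + e^{x_i^{\top}\beta}\right) \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

The first term is the average negative log-likelihood of the logistic model; the second term shrinks small coefficients to exactly zero, which is what makes the fitted model sparse.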

Main Methods:

Developed the screening and one-step linearization infused DAC (SOLID) algorithm.
Integrated screening and linearization within the DAC framework for penalized estimation.
Introduced modified cross-validation (MCV) utilizing SOLID's intermediate results for computational efficiency.
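
The general idea of combining a divide-and-conquer pass with a one-step linearization can be sketched as follows. This is a generic illustration of a least squares approximation, not the authors' SOLID implementation: the function names, the zero starting value, the plain L1 penalty, and the simulated data are all assumptions, and the screening step is omitted. Each block contributes only its gradient and Hessian at an initial estimate, and the penalized fit is then computed on the aggregated quadratic surrogate.

```python
import numpy as np


def soft_threshold(z, lam):
    """Scalar soft-thresholding operator used by the L1 coordinate update."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def block_summaries(X, y, beta):
    """Gradient and Hessian of the summed logistic negative log-likelihood on one block."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (p - y)
    hess = (X * (p * (1.0 - p))[:, None]).T @ X
    return grad, hess


def dac_one_step_lasso(X, y, beta0, n_blocks=4, lam=0.05, n_sweeps=200):
    """Hypothetical helper: aggregate block summaries at beta0, then solve the
    L1-penalized quadratic surrogate by coordinate descent."""
    beta0 = np.asarray(beta0, dtype=float)
    n, d = X.shape
    grad = np.zeros(d)
    hess = np.zeros((d, d))
    # Divide: each block only has to report a d-vector and a d-by-d matrix.
    for idx in np.array_split(np.arange(n), n_blocks):
        g, h = block_summaries(X[idx], y[idx], beta0)
        grad += g
        hess += h
    grad /= n
    hess /= n
    # Conquer: minimize grad'(b - beta0) + 0.5 (b - beta0)' hess (b - beta0) + lam * |b|_1.
    beta = beta0.copy()
    for _ in range(n_sweeps):
        for j in range(d):
            r_j = grad[j] + hess[j] @ (beta - beta0)  # surrogate gradient at current beta
            beta[j] = soft_threshold(hess[j, j] * beta[j] - r_j, lam) / hess[j, j]
    return beta


# Simulated data (assumed for illustration): two active covariates, three noise covariates.
rng = np.random.default_rng(0)
n, d = 4000, 5
X = rng.normal(size=(n, d))
true_beta = np.array([1.5, -2.0, 0.0, 0.0, 0.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = dac_one_step_lasso(X, y, np.zeros(d), n_blocks=4, lam=0.05)
print(beta_hat)
```

Because the block summaries are simply added, the result does not depend on how many blocks the data are split into, which is what makes this kind of aggregation attractive for data that do not fit in memory at once. Starting from zero gives a coarse linearization; calling the function again with `beta_hat` as the new `beta0` re-linearizes around a better estimate.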

Main Results:

SOLID and MCV significantly outperform existing DAC methods in computational speed.
The proposed methods achieve statistical efficiency comparable to full sample-based estimators.
MCV is the first DAC procedure to provide valid inference on predictive model accuracy.

Conclusions:

SOLID and MCV offer a computationally efficient and statistically sound approach for analyzing massive datasets in sparse logistic regression.
The developed inference procedure provides valid interval estimators for risk prediction accuracy.
The SOLID procedure was successfully applied to a clinical notes-based disease diagnosis classification model.