Sample size determination for prediction models via learning-type curves.

Alimu Dayimu¹, Nikola Simidjievski²,³, Nikolaos Demiris⁴

¹Cambridge Clinical Trials Unit Cancer Theme, University of Cambridge, Cambridge, UK.

Statistics in Medicine

May 28, 2024

Summary

This summary is machine-generated.

This study uses learning curves to improve sample size calculations for prediction models. Borrowing information across sample sizes makes the resulting sample size estimates more accurate and more robust.

Keywords:

Gaussian process; extrapolation; learning curve; sample size estimation; statistical design

Area of Science:

Statistics
Machine Learning
Biostatistics

Background:

Sample size determination is crucial for the reliability of prediction models.
Existing methods often lack robustness and efficiency, especially when dealing with limited data or extrapolating findings.

Purpose of the Study:

To develop and evaluate novel methodologies for sample size determination in prediction modeling.
To enhance the performance and statistical efficiency of sample size calculations by leveraging learning curves.

Main Methods:

Proposing two methods: a deterministic learning curve "skeleton" and a Gaussian process model built upon it.
Applying the methods with a range of learning algorithms for the primary endpoint and with distinct measures of efficacy.
Illustrating the methods with binary and survival endpoints.
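
The idea of a deterministic learning curve can be sketched in a few lines. The summary does not state the functional form used in the paper, so the inverse power law below, score(n) = a - b * n^(-c), is an assumption chosen because it is a common learning-curve shape; the function names and the grid of exponents are likewise hypothetical. The sketch fits the curve to performance estimates obtained at several pilot sample sizes, then inverts it to find the smallest sample size that reaches a target performance.

```python
import math

def fit_inverse_power_law(sizes, scores, exponents=None):
    """Fit score(n) = a - b * n**(-c).

    Assumed form, not taken from the paper. The exponent c is chosen by
    grid search; for each candidate c, (a, b) come from ordinary least
    squares of the scores on x = n**(-c).
    """
    if exponents is None:
        exponents = [k / 100 for k in range(5, 201)]  # c in [0.05, 2.00]
    best = None
    m = len(sizes)
    y_bar = sum(scores) / m
    for c in exponents:
        x = [n ** (-c) for n in sizes]
        x_bar = sum(x) / m
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, scores))
        slope = sxy / sxx            # slope of score on x, equal to -b
        a = y_bar - slope * x_bar    # intercept, the plateau as n grows
        b = -slope
        sse = sum((a - b * xi - yi) ** 2 for xi, yi in zip(x, scores))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c

def required_sample_size(a, b, c, target):
    """Smallest n with a - b * n**(-c) >= target.

    Returns None when the target is at or above the plateau a, or when
    the fitted curve is not increasing (b <= 0).
    """
    if target >= a or b <= 0:
        return None
    return math.ceil((b / (a - target)) ** (1 / c))
```

In use, `sizes` would hold the pilot sample sizes and `scores` the corresponding performance estimates (for example, a discrimination measure for a binary endpoint); `required_sample_size(a, b, c, target)` then extrapolates along the fitted curve. Because every pilot size contributes to one curve, the estimate draws on all of them rather than on a single calculation.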

Main Results:

Combining individual sample size calculations via learning curves universally improves performance.
The Gaussian process-based learning curve demonstrates superior robustness and statistical efficiency.
Computational efficiency is comparable between the proposed methods.

Conclusions:

Learning curves effectively integrate information across different sample sizes for more reliable sample size determination.
Anchoring sample size extrapolations against historical data is recommended when available.
The Gaussian process approach offers a statistically sound and efficient solution for sample size planning in prediction modeling.
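
A Gaussian process layered on a learning-curve skeleton can be illustrated as follows. This is a minimal sketch, not the authors' model: it assumes the skeleton is supplied as a mean function `mean_fn`, places a squared-exponential kernel on log sample size, and uses fixed, hypothetical hyperparameters (`scale`, `length`, `noise`) rather than estimating them. It returns the posterior mean and standard deviation of performance at a new sample size, so the extrapolation carries a measure of uncertainty.

```python
import math

def rbf(u, v, scale, length):
    """Squared-exponential kernel (an assumed choice)."""
    return scale ** 2 * math.exp(-0.5 * ((u - v) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for k in range(i, n + 1):
                M[r][k] -= f * M[i][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gp_predict(sizes, scores, mean_fn, n_new,
               scale=0.05, length=1.0, noise=0.01):
    """Posterior mean and sd of performance at sample size n_new.

    The GP models residuals around mean_fn, indexed by log(n).
    All hyperparameter defaults are illustrative assumptions.
    """
    t = [math.log(n) for n in sizes]
    t_new = math.log(n_new)
    resid = [y - mean_fn(n) for n, y in zip(sizes, scores)]
    # Kernel matrix of the observed sizes, with noise on the diagonal.
    K = [[rbf(ti, tj, scale, length) + (noise ** 2 if i == j else 0.0)
          for j, tj in enumerate(t)] for i, ti in enumerate(t)]
    k_star = [rbf(ti, t_new, scale, length) for ti in t]
    alpha = solve(K, resid)
    mean = mean_fn(n_new) + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = rbf(t_new, t_new, scale, length) - sum(
        k * vi for k, vi in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))
```

With a skeleton such as `mean_fn = lambda n: 0.85 - 1.2 * n ** -0.5` (hypothetical values), the posterior standard deviation is small near the observed sample sizes and grows as `n_new` moves beyond them, which is one way to see why anchoring extrapolations against historical data is attractive.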