Minimum sample size for developing a multivariable prediction model using multinomial logistic regression.

Alexander Pate1, Richard D Riley2, Gary S Collins3,4

  • 1Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK.

Statistical Methods in Medical Research | January 20, 2023
Summary
This summary is machine-generated.

Researchers can now determine appropriate sample sizes for multinomial logistic regression models using three new criteria. These criteria help minimize overfitting and ensure precise risk estimation for outcomes with multiple categories.

Keywords:
Clinical prediction models, multinomial logistic regression, sample size, shrinkage

Area of Science:

  • Statistics
  • Biostatistics
  • Epidemiology

Background:

  • Multinomial logistic regression models predict outcomes with more than two categories.
  • Determining an adequate sample size (n) is crucial for model validity; n must be appropriate relative to the number of events (E_k) and the number of predictor parameters (p_k) for each outcome category k.
  • Existing sample size criteria are primarily for binary outcomes.
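
To make the background concrete, the following minimal Python sketch shows how a multinomial logistic regression model turns predictors into one predicted risk per outcome category. It assumes the common parameterisation in which each non-reference category has its own intercept and slopes relative to a reference category. The function name `multinomial_probs` and all coefficient values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def multinomial_probs(x, coefs):
    """Return predicted probabilities for every outcome category.

    x     : list of predictor values for one individual.
    coefs : list of (intercept, slopes) pairs, one per non-reference
            category, each expressed relative to the reference category.
            (Hypothetical values are used in the example below.)
    """
    # Linear predictor of each non-reference category versus the reference.
    lps = [b0 + sum(b * xi for b, xi in zip(bs, x)) for b0, bs in coefs]
    denom = 1.0 + sum(math.exp(lp) for lp in lps)
    # Reference category first, then the remaining categories in order.
    return [1.0 / denom] + [math.exp(lp) / denom for lp in lps]

# Three outcome categories, two predictors; the coefficients are made up.
probs = multinomial_probs([0.5, 1.0], [(-1.0, [0.8, 0.2]), (-2.0, [0.1, 1.1])])
print([round(q, 3) for q in probs])
```

Each non-reference category adds its own set of predictor parameters, which is why the required sample size depends on the number of events and parameters per category rather than on a single events-per-variable figure.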

Purpose of the Study:

  • Propose three criteria for minimum sample size in multinomial logistic regression.
  • Address sample size determination for models with >2 outcome categories.
  • Extend existing criteria for binary outcomes to multinomial settings.

Main Methods:

  • Developed three criteria for sample size calculation.
  • Criterion (i) focuses on minimizing model overfitting using Cox-Snell R² from sub-models.
  • Criteria (ii) and (iii) extend existing methods for binary outcomes.
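
The idea behind criterion (i) can be sketched in Python as follows. The sketch assumes that the established binary-outcome shrinkage formula of Riley and colleagues, n = p / ((S - 1) ln(1 - R²_CS / S)), is applied to each pairwise 'one-to-one' sub-model, that each result is then scaled up by the share of individuals expected to fall in that pair of categories, and that the largest value is taken. This scaling-and-maximum step is an illustrative reading, not a confirmed reproduction of the paper's procedure. The function names, the target shrinkage of 0.9, and all input values are hypothetical.

```python
import math

def n_shrinkage(p, r2_cs, shrinkage=0.9):
    """Binary-outcome sample size targeting a given shrinkage factor S:
    n = p / ((S - 1) * ln(1 - R2_CS / S))."""
    return p / ((shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))

def n_criterion_i(p, pair_r2_cs, prevalence, shrinkage=0.9):
    """Illustrative multinomial extension (an assumption, see lead-in).

    p          : predictor parameters in each one-to-one sub-model.
    pair_r2_cs : {(k, r): anticipated Cox-Snell R2 of the sub-model
                  comparing categories k and r}.
    prevalence : {category: anticipated proportion of individuals}.
    """
    required = {}
    for (k, r), r2 in pair_r2_cs.items():
        # Participants needed within categories k and r only.
        n_pair = n_shrinkage(p, r2, shrinkage)
        # Scale to the full cohort using the pair's share of individuals.
        share = prevalence[k] + prevalence[r]
        required[(k, r)] = math.ceil(n_pair / share)
    # The most demanding sub-model determines the overall minimum.
    return max(required.values()), required

# Hypothetical inputs: 3 categories, 10 parameters per sub-model.
prevalence = {1: 0.5, 2: 0.3, 3: 0.2}
pair_r2_cs = {(1, 2): 0.2, (1, 3): 0.2, (2, 3): 0.2}
n_min, per_pair = n_criterion_i(10, pair_r2_cs, prevalence)
print(n_min, per_pair)
```

In this sketch the sub-model comparing the two rarest categories drives the requirement, because the fewest individuals contribute to it. For real study planning, the authors' pmsampsize software is the appropriate tool rather than this illustration.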

Main Results:

  • Criterion (i) was validated via simulation, confirming its ability to control overfitting.
  • Simulation showed that the sample size must be based on the Cox-Snell R² of the distinct 'one-to-one' sub-models.
  • Criteria (ii) and (iii) are direct extensions and did not require simulation.

Conclusions:

  • The proposed criteria provide a framework for sample size determination in multinomial logistic regression.
  • Implementation is demonstrated with a worked example for ovarian tumor type prediction.
  • The criteria will be integrated into the pmsampsize R library and Stata modules.