One statistical test is sufficient for assessing new predictive markers.

Andrew J Vickers¹, Angel M Cronin, Colin B Begg

¹Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 44, New York, NY 10065 USA. vickersa@mskcc.org

BMC Medical Research Methodology

February 1, 2011

Summary

This summary is machine-generated.

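Testing the difference in area under the receiver operating characteristic curve (AUC) between models with and without a new predictor is less powerful than regression-based tests such as the likelihood ratio and Wald tests. The significance of a new predictor is better assessed within the regression model itself, which also avoids the discordant conclusions that can arise when both approaches are applied.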

Area of Science:

Biostatistics
Medical Informatics
Epidemiology

Background:

The area under the receiver operating characteristic curve (AUC) is increasingly used to evaluate novel disease predictors.
Investigators often use a two-stage approach: significance testing in regression models, then comparing AUCs with and without the predictor.
These distinct methods can yield conflicting conclusions regarding predictor utility.

Purpose of the Study:

To compare the statistical properties of AUC comparison with those of regression-based tests when evaluating novel predictors in multivariable models.
To assess the performance of likelihood ratio and Wald tests against AUC comparison for predictor significance.
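
For orientation, the quantities being compared can be written out as follows. This is a sketch in our own notation (the symbols β, ℓ, and s are assumptions, not taken from the paper), for a logistic model with one established predictor X and one new predictor X*. The reference distributions are the usual large-sample approximations.

```latex
\begin{align*}
\operatorname{logit}\Pr(Y = 1 \mid X, X^*) &= \beta_0 + \beta_1 X + \beta_2 X^*,
  \qquad H_0\colon \beta_2 = 0 \\
\text{Likelihood ratio:}\quad
  \Lambda &= 2\,\bigl[\ell(\hat\beta_0, \hat\beta_1, \hat\beta_2)
            - \ell(\tilde\beta_0, \tilde\beta_1, 0)\bigr]
  \;\approx\; \chi^2_1 \text{ under } H_0 \\
\text{Wald:}\quad
  z &= \hat\beta_2 \big/ \widehat{\mathrm{SE}}(\hat\beta_2)
  \;\approx\; N(0, 1) \text{ under } H_0 \\
\text{AUC:}\quad
  \mathrm{AUC} &= \Pr\bigl(s_{\text{case}} > s_{\text{control}}\bigr)
\end{align*}
```

Here hats denote estimates from the model that includes X*, tildes denote estimates from the model without it, and s is the predicted score from whichever model is being evaluated.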

Main Methods:

A simulation study was conducted using logistic regression.
Two predictors, X and X*, were generated with varying predictive strengths.
Likelihood ratio, Wald tests, and AUC comparisons were performed to evaluate incremental predictor contribution.
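
One replicate of a simulation like this might be sketched as below, using only the Python standard library. Everything in it is an illustrative assumption rather than the authors' code: the function names, the sample size, the coefficients, the independent standard normal predictors, and the fixed seed are all hypothetical. The sketch fits the models with and without X* by Newton-Raphson, computes the likelihood ratio and Wald p-values for X*, and reports both AUCs descriptively. It does not implement a formal test for the difference between two AUCs.

```python
import math
import random

def solve(A, b):
    """Solve the small system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def score_and_info(rows, y, beta):
    """Return log-likelihood, gradient, and information matrix at beta."""
    k = len(beta)
    ll, grad = 0.0, [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        p = 1.0 / (1.0 + math.exp(-eta))
        ll += yi * eta - math.log1p(math.exp(eta))
        for i in range(k):
            grad[i] += (yi - p) * x[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x[i] * x[j]
    return ll, grad, info

def fit_logistic(rows, y, iters=25):
    """Fit by Newton-Raphson; return (beta, log-likelihood, information)."""
    beta = [0.0] * len(rows[0])
    for _ in range(iters):
        _, grad, info = score_and_info(rows, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    ll, _, info = score_and_info(rows, y, beta)
    return beta, ll, info

def auc(scores, y):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def simulate(n, b_x, b_xstar, rng):
    """Assumed design: independent N(0, 1) predictors X and X*, Bernoulli outcome."""
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    y = [int(rng.random() < 1.0 / (1.0 + math.exp(-(b_x * x + b_xstar * xstar))))
         for x, xstar in data]
    return data, y

def linear_scores(rows, beta):
    """Linear predictor for each row; AUC depends only on its ranking."""
    return [sum(b * v for b, v in zip(beta, r)) for r in rows]

def evaluate_new_predictor(data, y):
    """Test whether X* adds to a model that already contains X."""
    reduced = [[1.0, x] for x, _ in data]              # intercept + X
    full = [[1.0, x, xstar] for x, xstar in data]      # intercept + X + X*
    beta_r, ll_r, _ = fit_logistic(reduced, y)
    beta_f, ll_f, info_f = fit_logistic(full, y)
    lr_stat = 2.0 * (ll_f - ll_r)
    p_lr = math.erfc(math.sqrt(max(lr_stat, 0.0) / 2.0))  # chi-square tail, 1 df
    var_xstar = solve(info_f, [0.0, 0.0, 1.0])[2]         # variance of the X* coefficient
    wald_z = beta_f[2] / math.sqrt(var_xstar)
    p_wald = math.erfc(abs(wald_z) / math.sqrt(2.0))      # two-sided normal tail
    return {
        "lr_stat": lr_stat, "p_lr": p_lr,
        "wald_z": wald_z, "p_wald": p_wald,
        "auc_reduced": auc(linear_scores(reduced, beta_r), y),
        "auc_full": auc(linear_scores(full, beta_f), y),
    }

rng = random.Random(2011)  # fixed seed so the run is reproducible
data, y = simulate(n=1000, b_x=1.0, b_xstar=0.8, rng=rng)
print(evaluate_new_predictor(data, y))
```

Calling `simulate` with `b_xstar=0.0` generates data under the null hypothesis. Repeating that over many seeds and counting how often each p-value falls below the chosen significance level gives an empirical estimate of test size.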

Main Results:

Regression-based tests (likelihood ratio and Wald) showed sizes close to nominal under the null hypothesis.
The AUC comparison test was extremely conservative, with sizes below 0.006 across all configurations.
The AUC test demonstrated substantially lower power than regression tests when the predictor was associated with the outcome.
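
The terms size and power used in these results have standard definitions, sketched below in our own notation (R, p_r, and α are assumed symbols, not the paper's). H0 is the hypothesis that the new predictor adds nothing to the model, and α is the nominal significance level.

```latex
\begin{align*}
\text{size} &= \Pr(\text{reject } H_0 \mid H_0 \text{ true}), &
\text{power} &= \Pr(\text{reject } H_0 \mid H_0 \text{ false}), \\
\widehat{\text{size}} &= \frac{1}{R} \sum_{r=1}^{R} \mathbf{1}\{p_r < \alpha\}
  & &\text{(over } R \text{ datasets simulated under } H_0\text{)}.
\end{align*}
```

A well-calibrated test has size close to α. A test whose size falls well below α is called conservative: it rejects too rarely even when the null hypothesis is true, which typically costs power when the null is false.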

Conclusions:

Regression modeling is the most appropriate method for evaluating the statistical significance of new predictors.
While conceptually similar, AUC comparison has inferior statistical properties compared to regression-based tests.
Using both methods concurrently often leads to inconsistent findings; AUC remains useful for descriptive, initial evaluations.