Contra: Contrarian statistics for controlled variable selection.

Mukund Sudarshan¹, Aahlad Puli¹, Lakshmi Subramanian¹

  • ¹ Courant Institute, New York University.

Proceedings of Machine Learning Research | September 15, 2021
Summary
This summary is machine-generated.

The contrarian randomization test (CONTRA) improves false discovery rate (FDR) control when covariate distributions must be estimated from data. It shows less FDR inflation than existing techniques, especially when those distributions are misspecified.

Area of Science:

  • Statistics
  • Machine Learning
  • Bioinformatics

Background:

  • Holdout randomization tests (HRTs) identify predictive covariates but can inflate the false discovery rate (FDR) when covariate distributions are estimated from data.
  • Accurate FDR control is crucial for reliable covariate selection in statistical modeling.
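
As a rough illustration of the holdout randomization test described above, the sketch below computes a p-value for one covariate. It is not the authors' code: the linear predictive model, the Gaussian linear estimate of the covariate's conditional distribution, the squared-error statistic, and all function names (`fit_linear`, `predict_linear`, `hrt_pvalue`) are assumptions made for the example. Estimating the conditional distribution from the training data, as done here, is the setting in which the summary says the FDR can inflate.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions of a model fitted by fit_linear."""
    return coef[0] + X @ coef[1:]

def hrt_pvalue(X_train, y_train, X_hold, y_hold, j, n_null=200, seed=0):
    """Holdout randomization p-value for covariate j (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    coef = fit_linear(X_train, y_train)  # predictive model, fitted once

    def loss(X):
        # Test statistic: mean squared error on the holdout set.
        return np.mean((y_hold - predict_linear(coef, X)) ** 2)

    t_obs = loss(X_hold)

    # Assumed null model: x_j given the other covariates is Gaussian
    # with a linear mean, estimated from the training data.
    others_train = np.delete(X_train, j, axis=1)
    cond = fit_linear(others_train, X_train[:, j])
    resid_sd = np.std(X_train[:, j] - predict_linear(cond, others_train))
    mean_hold = predict_linear(cond, np.delete(X_hold, j, axis=1))

    # Count null draws whose loss is at least as good as the observed loss.
    count = 0
    for _ in range(n_null):
        X_null = X_hold.copy()
        X_null[:, j] = mean_hold + resid_sd * rng.standard_normal(len(X_hold))
        count += int(loss(X_null) <= t_obs)
    return (1 + count) / (1 + n_null)
```

A small p-value means that replacing the covariate with null samples makes holdout predictions clearly worse, which is evidence that the covariate is predictive.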

Purpose of the Study:

  • To introduce the contrarian randomization test (CONTRA) for robust FDR control in scenarios with estimated or misspecified covariate distributions.
  • To offer a computationally efficient and powerful alternative to existing methods for high-dimensional data.

Main Methods:

  • CONTRA uses an equal mixture of two probabilistic models: one fitted to the real data and another fitted to a modified dataset in which the tested covariate is replaced by samples from its estimated distribution.
  • This approach explicitly addresses challenges arising from unknown or misspecified covariate distributions.
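
The equal-mixture idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published CONTRA implementation: the summary does not specify the model class or the test statistic, so the Gaussian linear models, the mean log-likelihood statistic, and every name below (`fit_gaussian_linear`, `mixture_loglik`, `sample_null_column`, `contra_style_pvalue`) are hypothetical choices for the example.

```python
import numpy as np

def fit_gaussian_linear(X, y):
    """Fit y ~ N(a + X b, sigma^2) by least squares; returns (coef, sigma)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma = np.std(y - A @ coef) + 1e-12  # small constant avoids sigma == 0
    return coef, sigma

def gaussian_logpdf(y, mean, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2

def mixture_loglik(models, X, y):
    """Mean log-likelihood of y under an equal-weight mixture of two models."""
    logps = np.stack([gaussian_logpdf(y, c[0] + X @ c[1:], s) for c, s in models])
    # log(0.5 * p_a + 0.5 * p_b), computed stably in log space
    return np.mean(np.logaddexp(logps[0], logps[1]) + np.log(0.5))

def sample_null_column(X_fit, X_target, j, rng):
    """Draw x_j given the other columns from an estimated Gaussian linear model."""
    coef, sigma = fit_gaussian_linear(np.delete(X_fit, j, axis=1), X_fit[:, j])
    others = np.delete(X_target, j, axis=1)
    return coef[0] + others @ coef[1:] + sigma * rng.standard_normal(len(X_target))

def contra_style_pvalue(X_train, y_train, X_hold, y_hold, j, n_null=200, seed=0):
    """Randomization p-value using the equal mixture as the scoring model."""
    rng = np.random.default_rng(seed)

    # Model A: fitted to the real training data.
    model_real = fit_gaussian_linear(X_train, y_train)

    # Model B: fitted to training data with covariate j replaced by null samples.
    X_train_null = X_train.copy()
    X_train_null[:, j] = sample_null_column(X_train, X_train, j, rng)
    model_null = fit_gaussian_linear(X_train_null, y_train)

    models = [model_real, model_null]
    t_obs = mixture_loglik(models, X_hold, y_hold)

    # Count null draws scoring at least as well as the observed holdout data.
    exceed = 0
    for _ in range(n_null):
        X_null = X_hold.copy()
        X_null[:, j] = sample_null_column(X_train, X_hold, j, rng)
        exceed += int(mixture_loglik(models, X_null, y_hold) >= t_obs)
    return (1 + exceed) / (1 + n_null)
```

In this sketch the second mixture component has been trained without access to the real values of the tested covariate, so the statistic depends on that covariate only through the component fitted to real data.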

Main Results:

  • CONTRA effectively reduces FDR inflation compared to state-of-the-art methods when covariate distributions are misspecified.
  • The method has asymptotic power 1 and remains computationally efficient on high-dimensional datasets with large sample sizes.
  • Its effectiveness was validated on synthetic benchmarks and a genetic dataset.
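
To make the FDR terminology concrete, the sketch below shows one common way to turn per-covariate p-values into a selected set, the Benjamini-Hochberg step-up procedure, together with the false discovery proportion that would be measured on a synthetic benchmark where the true nulls are known. Whether the paper uses this particular procedure is an assumption; the function names and the example p-values are illustrative only.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha=0.1):
    """Boolean mask of covariates selected by the Benjamini-Hochberg procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # The i-th smallest p-value is compared with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    selected = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank that passes
        selected[order[: k + 1]] = True    # select everything up to that rank
    return selected

def false_discovery_proportion(selected, is_null):
    """Fraction of selected covariates that are truly null (0 if none selected)."""
    selected = np.asarray(selected, dtype=bool)
    is_null = np.asarray(is_null, dtype=bool)
    n_selected = selected.sum()
    if n_selected == 0:
        return 0.0
    return float((selected & is_null).sum() / n_selected)
```

The FDR is the expectation of this proportion over repeated experiments, so FDR inflation shows up as an average proportion above the target level `alpha`.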

Conclusions:

  • CONTRA provides a robust and efficient solution for covariate selection with improved FDR control, particularly when dealing with estimated or misspecified covariate distributions.
  • The method shows promise for applications in bioinformatics and other fields requiring reliable statistical inference.