



Updated: Jul 26, 2025


A Model-free Variable Screening Method Based on Leverage Score.

Wenxuan Zhong (1), Yiwen Liu (2), Peng Zeng (3)

  • (1) Department of Statistics, University of Georgia, Athens, GA, 30602.

Journal of the American Statistical Association | June 22, 2023
Summary
This summary is machine-generated.

This study introduces a novel weighted leverage variable screening method for analyzing massive scientific datasets. The method efficiently identifies true predictors in complex models, demonstrating success in gene identification from spatial transcriptome data.

Keywords:
Bayesian information criteria; General index model; Leverage score; Singular value decomposition; Variable screening


Area of Science:

  • Data science and computational biology.
  • Statistical learning and machine learning applications in scientific research.

Background:

  • Massive datasets in science necessitate efficient data analysis methods.
  • Conventional statistical learning techniques face computational challenges with large sample sizes and numerous predictors.
  • Leverage score sampling has shown promise for linear regression, but it has not yet been adapted effectively to variable selection.
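
For context, the general idea of leverage score sampling in linear regression can be sketched as follows. This is a minimal illustration, not code from the study: rows of the design matrix are sampled with probability proportional to their leverage scores (the squared row norms of the left singular vectors), and the subsample is fit by weighted least squares. The function name `leverage_subsample_ols`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np


def leverage_subsample_ols(X, y, r, rng):
    """Fit least squares on r rows sampled by leverage score (illustrative)."""
    # Thin SVD: the columns of U span the column space of X.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    # Leverage score of row i is the squared norm of the i-th row of U.
    leverage = (U ** 2).sum(axis=1)
    prob = leverage / leverage.sum()
    # Sample rows with replacement, proportionally to leverage.
    idx = rng.choice(X.shape[0], size=r, replace=True, p=prob)
    # Reweight each sampled row by 1 / sqrt(r * prob) so that the
    # subsampled least-squares problem targets the full-data problem.
    w = 1.0 / np.sqrt(r * prob[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta


# Hypothetical usage on simulated data.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((2000, 4))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(2000)
print(leverage_subsample_ols(X_demo, y_demo, r=300, rng=rng))
```

The exact SVD is used here only for clarity. On truly massive data it is itself expensive, so leverage scores are usually approximated in practice.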

Purpose of the Study:

  • To propose a novel weighted leverage variable screening method for effective variable selection in large-scale datasets.
  • To extend the application of leverage score sampling beyond linear regression to general index models.
  • To address the computational challenges in extracting meaningful information from massive scientific data.

Main Methods:

  • Development of a weighted leverage variable screening method utilizing both left and right singular vectors of the design matrix.
  • Theoretical analysis to demonstrate the consistency of selected predictors.
  • Empirical validation through extensive simulation studies and application to real-world biological data.
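
The idea of scoring predictors with both left and right singular vectors can be sketched as follows. This summary does not give the study's exact formula, so the weighting below is an assumption made for illustration: the response is cut into slices, the rows of the left singular vector matrix are averaged within each slice (in the style of sliced inverse regression), and each predictor is scored by the quadratic form of its right singular vector row with the resulting matrix. The function name `weighted_leverage_scores`, the parameters `n_slices` and `rank`, and the simulated data are hypothetical.

```python
import numpy as np


def weighted_leverage_scores(X, y, n_slices=10, rank=None):
    """Score each column of X using left and right singular vectors.

    Illustrative only: the slice-based weighting is an assumption,
    not the formula from the study.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Center columns so the left singular vectors have mean zero.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    if rank is None:
        # Keep all numerically nonzero singular directions.
        rank = int(np.sum(s > s[0] * 1e-10))
    U, V = U[:, :rank], Vt[:rank, :].T  # U is n x rank, V is p x rank
    # Weight matrix: between-slice covariance of the rows of U.
    M = np.zeros((rank, rank))
    for idx in np.array_split(np.argsort(y), n_slices):
        if len(idx) == 0:
            continue
        u_bar = U[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(u_bar, u_bar)
    # Score for predictor j: V_j^T M V_j, where V_j is the j-th row of V.
    return np.einsum("jk,kl,jl->j", V, M, V)


# Hypothetical usage: only the first two predictors affect the response.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((1000, 30))
y_demo = 3 * X_demo[:, 0] - 2 * X_demo[:, 1] + 0.5 * rng.standard_normal(1000)
scores_demo = weighted_leverage_scores(X_demo, y_demo)
print(np.argsort(scores_demo)[::-1][:5])  # indices of the five highest scores
```

Predictors would then be ranked by score and the top ones retained. How many to keep is a separate choice that this sketch does not address.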

Main Results:

  • The proposed method consistently includes true predictors for both linear and general index models.
  • Simulation studies show weighted leverage screening to be computationally efficient and effective.
  • Successful identification of carcinoma-related genes using spatial transcriptome data.

Conclusions:

  • The weighted leverage variable screening method offers a computationally efficient and effective approach for variable selection in massive datasets.
  • This method advances the application of leverage score sampling for complex statistical modeling.
  • The approach has practical implications for biological data analysis, such as identifying disease-related genes.