



Detecting Interactions in High-Dimensional Data Using Cross Leverage Scores.

Sven Teschke¹, Katja Ickstadt¹,², Alexander Munteanu¹

  • ¹ Faculty of Statistics, TU Dortmund University, Dortmund, Germany.

Biometrical Journal. Biometrische Zeitschrift | November 29, 2024
PubMed
Summary
This summary is machine-generated.

We developed a scalable method using cross leverage scores (CLSs) to identify gene interactions influencing health outcomes. This approach efficiently detects important genetic interactions in large datasets, including genome-wide data.

Keywords:
cross leverage scores, genetics, high-dimensional data, interaction effects, sketching, variable selection


Area of Science:

  • Genetics
  • Statistical genetics
  • Bioinformatics

Background:

  • Investigating interactions between genetic variants (e.g., single-nucleotide polymorphisms, or SNPs) is crucial for understanding complex health outcomes.
  • Analyzing interactions in large genetic datasets is computationally challenging due to the high dimensionality.

Purpose of the Study:

  • To develop a computationally efficient variable selection method for detecting interactions in large-scale regression models.
  • To introduce and evaluate cross leverage scores (CLSs) for identifying important variable interactions while maintaining interpretability.
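
For orientation, cross leverage scores are commonly defined as the off-diagonal entries of the hat (projection) matrix H = A (AᵀA)⁻¹ Aᵀ, whose diagonal holds the ordinary leverage scores. The sketch below assumes that standard definition. The matrix `A`, the helper names `cross_leverage_scores` and `top_pairs`, and the choice to place the items being scored in the rows of `A` are illustrative assumptions, not details taken from the study.

```python
import numpy as np

def cross_leverage_scores(A):
    """Return the hat matrix H = Q Q^T for the rows of A.

    H[i, i] is the leverage score of row i; H[i, j] with i != j is the
    cross leverage score between rows i and j.
    Assumes A has full column rank (illustrative assumption).
    """
    # Thin QR gives an orthonormal basis Q of the column space of A,
    # so Q Q^T equals A (A^T A)^{-1} A^T without an explicit inverse.
    Q, _ = np.linalg.qr(A, mode="reduced")
    return Q @ Q.T

def top_pairs(H, k):
    """Return the k index pairs (i, j), i < j, with the largest |H[i, j]|."""
    iu, ju = np.triu_indices(H.shape[0], k=1)
    order = np.argsort(-np.abs(H[iu, ju]))[:k]
    return [(int(iu[t]), int(ju[t])) for t in order]

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
H = cross_leverage_scores(A)
pairs = top_pairs(H, 3)
```

Ranking pairs by the absolute value of their cross leverage score is one simple way such scores could be turned into a shortlist of candidate interactions; the study's actual selection rule may differ.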

Main Methods:

  • Developed a variable selection method based on cross leverage scores (CLSs) for interaction detection.
  • Implemented data batching and windowing techniques to scale computations for large datasets.
  • Utilized sketching-based approximations to further enhance computational efficiency.
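
The summary does not describe how batching and windowing are implemented. As one plausible illustration of the general idea, the sketch below scans only pairs whose indices lie within a fixed window and keeps a running top-k list, so the full n × n matrix of cross leverage scores is never stored. The function name `top_cross_leverage_windowed`, the `window` and `k` parameters, and the index-distance notion of a window are all assumptions made for this example.

```python
import heapq
import numpy as np

def top_cross_leverage_windowed(A, window, k):
    """Keep the k largest |H[i, j]| over pairs with 0 < j - i <= window.

    H = Q Q^T is the hat matrix of A; only one short slice of it is
    computed at a time (hypothetical windowing scheme).
    """
    Q, _ = np.linalg.qr(A, mode="reduced")
    n = Q.shape[0]
    heap = []  # min-heap of (|score|, i, j), size at most k
    for i in range(n - 1):
        hi = min(n, i + 1 + window)
        # Cross leverage scores H[i, j] for j = i+1, ..., hi-1.
        scores = Q[i + 1:hi] @ Q[i]
        for off, s in enumerate(scores):
            item = (abs(float(s)), i, i + 1 + off)
            if len(heap) < k:
                heapq.heappush(heap, item)
            else:
                heapq.heappushpop(heap, item)
    return sorted(heap, reverse=True)

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))
best_all = top_cross_leverage_windowed(A, window=49, k=5)   # every pair
best_near = top_cross_leverage_windowed(A, window=3, k=5)   # nearby pairs only
```

With `window` set to n − 1 the scan covers every pair; a smaller window trades completeness for fewer score evaluations.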

Main Results:

  • Cross leverage scores (CLSs) were shown to be directly correlated with the importance of variables in interaction effects.
  • Approximation methods using sketching were found to be effective for large-scale data analysis, preserving the interaction detection capabilities of CLSs.
  • The methods demonstrated scalability for genome-wide data analysis.
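
The summary does not say which sketching technique is used. To illustrate how a sketch can approximate cross leverage scores, the example below uses a dense Gaussian sketch, a standard choice that may differ from the study's. The tall matrix `A` is compressed to a few rows, the triangular factor `R` of the compressed matrix is used to approximately orthonormalize `A`, and the approximate hat matrix is compared with the exact one. The function name `sketched_hat_matrix` and all sizes are assumptions.

```python
import numpy as np

def sketched_hat_matrix(A, sketch_rows, rng):
    """Approximate H = A (A^T A)^{-1} A^T using a Gaussian sketch.

    S A has only `sketch_rows` rows, so its QR factorization is cheap.
    With R from that factorization, B = A R^{-1} is approximately
    orthonormal and B B^T approximates H.
    """
    n, d = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    _, R = np.linalg.qr(S @ A, mode="reduced")
    # Solve B R = A for B via R^T B^T = A^T, avoiding an explicit inverse.
    B = np.linalg.solve(R.T, A.T).T
    return B @ B.T

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 4))

Q, _ = np.linalg.qr(A, mode="reduced")
H_exact = Q @ Q.T
H_approx = sketched_hat_matrix(A, sketch_rows=400, rng=rng)

# Relative Frobenius-norm error of the approximation.
rel_err = np.linalg.norm(H_approx - H_exact) / np.linalg.norm(H_exact)
```

The full matrices are formed here only to measure the error on a small example; at genome scale one would evaluate selected entries of `B @ B.T` rather than the whole product.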

Conclusions:

  • The developed CLS method and its approximations offer a scalable solution for identifying gene-gene interactions in large genetic datasets.
  • These methods facilitate efficient analysis of complex genetic architectures influencing health outcomes.
  • The approach is validated through simulations and an application to real-world genetic data from the HapMap project.