Sparse canonical correlation analysis with application to genomic data integration.

Elena Parkhomenko¹, David Tritchler, Joseph Beyene

¹Hospital for Sick Children Research Institute. elena@utstat.toronto.edu

Statistical Applications in Genetics and Molecular Biology

February 19, 2009

Summary

This summary is machine-generated.

Sparse Canonical Correlation Analysis (SCCA) identifies relationships between variable sets in high-dimensional genomic data. This method selects sparse subsets of variables, improving biological interpretability and computational efficiency for complex analyses.

Area of Science:

Genomics
Statistical Genetics
Bioinformatics

Background:

Large-scale genomic studies involve complex multivariate relationships between phenotypic and genotypic data.
Traditional canonical correlation analysis (CCA) builds linear combinations of all variables in each set; in high-dimensional data, such combinations lack biological plausibility and interpretability.
Insufficient sample sizes exacerbate computational issues and reduce generalizability in high-dimensional CCA.
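
For reference, the classical CCA problem described above is usually written as the maximization below. The notation is an assumption introduced here, not taken from the paper: u and v are the loading vectors for the two variable sets, Σ_XX and Σ_YY are the within-set covariance matrices, and Σ_XY is the cross-covariance matrix. When the number of variables exceeds the sample size, the sample estimates of Σ_XX and Σ_YY are singular, which is one source of the computational issues noted above.

```latex
\rho = \max_{u,\, v}
\frac{u^\top \Sigma_{XY}\, v}
     {\sqrt{u^\top \Sigma_{XX}\, u}\;\sqrt{v^\top \Sigma_{YY}\, v}}
```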

Purpose of the Study:

To introduce Sparse Canonical Correlation Analysis (SCCA) for identifying multivariate relationships in high-dimensional data.
To develop an extension, adaptive SCCA, addressing limitations of standard SCCA.
To enhance biological interpretability and computational efficiency in genomic data analysis.

Main Methods:

Sparse Canonical Correlation Analysis (SCCA) maximizes correlation between variable subsets while performing variable selection.
Adaptive SCCA, an extension of SCCA, further refines variable selection.
Both methods are evaluated on simulated data and applied to human gene expression data.
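
The idea of maximizing correlation while selecting variables can be illustrated with a minimal sketch. It assumes one common way of obtaining sparse loadings: alternating power-iteration updates on the sample cross-covariance matrix, with soft-thresholding after each update. This is an illustration under stated assumptions, not necessarily the authors' exact algorithm. The function names, the penalties `lam_u` and `lam_v`, and the choice to standardize the variables and ignore within-set covariances are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry toward zero by lam; entries smaller than lam become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit(a):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a


def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=200, tol=1e-8):
    """Sketch of sparse CCA for the first pair of canonical vectors.

    Assumption: variables are standardized and within-set covariances are
    treated as identity, so only the cross-covariance matrix K is used.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    K = X.T @ Y / (X.shape[0] - 1)  # sample cross-covariance, shape (p, q)

    # Start from the leading right singular vector of K.
    _, _, vt = np.linalg.svd(K, full_matrices=False)
    v = vt[0]
    u = np.zeros(K.shape[0])

    for _ in range(n_iter):
        # Update each loading vector, then threshold to drop weak variables.
        u_new = unit(soft_threshold(unit(K @ v), lam_u))
        v_new = unit(soft_threshold(unit(K.T @ u_new), lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break

    # If a penalty removed every variable, no correlation can be computed.
    if not u.any() or not v.any():
        return u, v, 0.0
    rho = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, rho
```

Calling `sparse_cca(X, Y)` on two sample-by-variable arrays returns the two loading vectors and the correlation between the resulting linear combinations. Variables with a zero loading are the ones excluded by the selection step, and larger penalties exclude more of them.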

Main Results:

SCCA provides sparse solutions, selecting small subsets of variables for improved interpretability.
Adaptive SCCA offers an enhanced approach to variable selection in high-dimensional correlation analysis.
Both methods demonstrate utility in analyzing natural variation in human gene expression.

Conclusions:

SCCA and adaptive SCCA are effective for analyzing high-dimensional genomic data, offering interpretable and computationally feasible solutions.
These methods address key challenges in multivariate analysis of large-scale biological datasets.
The application to human gene expression highlights their practical relevance in biological research.