Related Concept Videos

Coefficient of Variation

The coefficient of variation measures the dispersion of the data points or distribution around the mean. Using the coefficient of variation, we can compare two data series with drastically different means or different units of measurement. The coefficient of variation for a sample and a population is expressed as a percentage of the ratio of standard deviation to the mean.
The coefficient of variation is a practical statistical tool in finance. It allows investors to assess the volatility or...
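As a minimal sketch of the ratio described above (standard deviation divided by the mean, expressed as a percentage), the following uses Python's standard `statistics` module. The function name `coefficient_of_variation` and the two sample series are illustrative assumptions, not part of the original text.

```python
import statistics

def coefficient_of_variation(values, population=False):
    """Return the standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # pstdev divides by n (population); stdev divides by n - 1 (sample).
    spread = statistics.pstdev(values) if population else statistics.stdev(values)
    return spread / mean * 100

# Hypothetical series in different units (made-up numbers for illustration).
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]
cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
```

Both made-up series have the same standard deviation, but the weights have a smaller mean, so their coefficient of variation is larger. This is the kind of unit-free comparison the measure is meant for.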

Residual Plots

A residual plot is a statistical representation of data used to analyze correlation and regression results. It helps verify the requirements for drawing specific conclusions about correlation and regression. To obtain the residual plot, first, the residual for each data value is calculated, which is simply the vertical distance between the observed and the predicted value obtained from the regression equation.
When the residual values are plotted against the variable x, it is called a residual...

Kendall's Coefficient of Concordance

Kendall's Coefficient of Concordance (W), also known as Kendall's W, is a non-parametric statistical measure used to assess the agreement or concordance between multiple raters or judges when they rank a set of items. It is often used when you have ordinal data (ranks) and you want to see if there is consistency or consensus among the raters. It is widely applied in research areas such as psychology, medicine, and social sciences, where multiple judges are asked to rank or rate subjects or...

Decision Making: P-value Method

Hypothesis testing with the P-value method involves calculating the P-value from the sample data and interpreting it.
First, a specific claim about the population parameter is proposed. The claim is based on the research question and is stated in a simple form. Further, an opposing statement to the claim is also stated. These statements can act as null and alternative hypotheses: a null hypothesis would be a neutral statement while the alternative hypothesis can have a...

Coefficient of Correlation

The correlation coefficient, r, developed by Karl Pearson in the early 1900s, is a numerical measure of the strength and direction of the linear association between the independent variable x and the dependent variable y.
If you suspect a linear relationship between x and y, then r can measure how strong the linear relationship is.
What the VALUE of r tells us:
The value of r is always between –1 and +1: –1 ≤ r ≤ 1.
The size of the correlation r indicates the strength of the linear...

z Scores and Area Under the Curve

z scores are the standardized values obtained after converting a normal distribution into a standard normal distribution. A z score is measured in units of the standard deviation. The z score tells you how many standard deviations the value x is above (to the right of) or below (to the left of) the mean, μ. Values of x that are larger than the mean have positive z scores, and values of x that are smaller than the mean have negative z scores. If x equals the mean, then x has a z score of zero.
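The standardization described above can be sketched in a few lines of Python. The function names `z_score` and `area_below` are illustrative assumptions. The area under the standard normal curve to the left of z is computed from the error function in the standard `math` module.

```python
import math

def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

def area_below(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical distribution with mean 5 and standard deviation 2.
z_above = z_score(9, mean=5, std_dev=2)   # x is larger than the mean
z_at_mean = z_score(5, mean=5, std_dev=2) # x equals the mean
```

A value equal to the mean has a z score of zero, and half of the area under the curve lies to its left.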

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Cleavage of MEP-1 by DPF-3 reveals novel substrate specificity and its impact on reproductive fitness.

EMBO reports·2026

Same author

Thymic output in human newborns is shaped by environmental exposures and a common TCRD genetic variant.

Journal of human immunity·2026

Same author

An in vitro menstrual cycle using organoids captures epithelial cell transitions during menstruation and regeneration of the human endometrium.

Cell stem cell·2026

Same author

The AML cellular state space unveils NPM1 immune evasion subtypes with distinct clinical outcomes.

Nature communications·2025

Same author

Building a continuous benchmarking ecosystem in bioinformatics.

PLoS computational biology·2025

Same author

Cleavage of MEP-1 by DPF-3 Reveals Novel Substrate Specificity and Its Impact on Reproductive Fitness.

bioRxiv : the preprint server for biology·2025

Same journal

SurvGME: an R package for survival analysis with graphical and measurement error models.

BMC bioinformatics·2026

Same journal

SimMapNet: a Bayesian framework for gene regulatory network inference using gene ontology similarities as external hint.

BMC bioinformatics·2026

Same journal

Dual channel drug-drug interactions extraction based on cross attention.

BMC bioinformatics·2026

Same journal

FeSseqdb: a curated sequence-level database and interpretable machine learning framework for identifying iron-sulfur proteins.

BMC bioinformatics·2026

Same journal

Benchmarking DNA barcode decoding strategies under high error rates.

BMC bioinformatics·2026

Same journal

pyVIPER: a fast and scalable Python package for protein activity estimation and master regulator analysis of single-cell RNA sequencing data.

BMC bioinformatics·2026

Related Experiment Video

Updated: May 30, 2026

Identification of Disease-related Spatial Covariance Patterns using Neuroimaging Data

Published on: June 26, 2013

The projection score--an evaluation criterion for variable subset selection in PCA visualization.

Magnus Fontes¹, Charlotte Soneson

¹Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden. fontes@maths.lth.se

BMC Bioinformatics

July 30, 2011

Summary

This summary is machine-generated.

A new projection score objectively measures variable subset informativeness for Principal Component Analysis (PCA) visualization. This method enhances hypothesis generation from high-dimensional data by identifying optimal variable subsets for clearer interpretations.

More Related Videos

A Psychophysics Paradigm for the Collection and Analysis of Similarity Judgments

Published on: March 1, 2022

Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances

Published on: October 11, 2018

Area of Science:

Data Science
Bioinformatics
Statistical Visualization

Background:

High-dimensional datasets are increasingly common in exploratory research for hypothesis generation.
Principal Component Analysis (PCA) is a key visualization method, but the resulting picture can be obscured by non-informative variables.
Current variable filtering methods for PCA lack objective criteria for optimal subset selection.

Purpose of the Study:

To introduce an objective measure for assessing variable subset informativeness in PCA visualization.
To enable systematic selection of optimal variable subsets for improved exploratory data analysis.

Main Methods:

Development of the 'projection score,' a measure of variable subset informativeness for PCA.
Application of the projection score to identify optimal variable subsets across different filtering techniques.
Validation using both microarray and synthetic datasets.
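The summary does not give the formula for the projection score, so the following is only a rough illustration of the general idea, not the paper's definition. It assumes that informativeness can be approximated by comparing the fraction of variance captured by the first principal component with the average fraction obtained after shuffling each variable independently. The function names, the permutation null, the use of a single component, and the synthetic data are all assumptions made for this sketch.

```python
import random

def top_variance_fraction(rows, iters=200):
    """Fraction of total variance captured by the first principal component."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    total = sum(cov[j][j] for j in range(p))
    if total == 0:
        return 0.0
    # Power iteration approximates the largest eigenvalue of the covariance.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            return 0.0
        v = [x / norm for x in w]
    lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    return lam / total

def projection_score_sketch(rows, n_perm=50, seed=0):
    """Observed variance fraction minus its mean under a permutation null (assumed form)."""
    rng = random.Random(seed)
    observed = top_variance_fraction(rows)
    p = len(rows[0])
    null_total = 0.0
    for _ in range(n_perm):
        cols = [[r[j] for r in rows] for j in range(p)]
        for col in cols:
            rng.shuffle(col)  # break any relationship between variables
        permuted = [list(t) for t in zip(*cols)]
        null_total += top_variance_fraction(permuted)
    return observed - null_total / n_perm

# Synthetic data: four variables sharing one signal versus four independent variables.
rng = random.Random(1)
signal = [rng.gauss(0, 1) for _ in range(40)]
structured = [[s + rng.gauss(0, 0.1) for _ in range(4)] for s in signal]
noise = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(40)]
score_structured = projection_score_sketch(structured)
score_noise = projection_score_sketch(noise)
```

Under these assumptions, a variable subset that shares structure should score well above a subset of unrelated variables. To compare filtering thresholds, one could compute such a score for each candidate subset and keep the subset with the highest value.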

Main Results:

The projection score effectively quantifies the informativeness of variable subsets for PCA.
Optimal variable subsets were identified for various filtering methods, leading to enhanced visualization clarity.
The score's applicability was demonstrated across diverse datasets.

Conclusions:

The projection score is an interpretable and universally applicable metric for PCA visualization.
It facilitates systematic identification of the most informative variable subsets for practical exploratory analysis.
This method improves the generation of relevant hypotheses from complex data.