


The projection score--an evaluation criterion for variable subset selection in PCA visualization.

Magnus Fontes¹, Charlotte Soneson

  • ¹Centre for Mathematical Sciences, Lund University, Box 118, SE-221 00 Lund, Sweden. fontes@maths.lth.se

BMC Bioinformatics | July 30, 2011
Summary
This summary is machine-generated.

The projection score is an objective measure of how informative a variable subset is for Principal Component Analysis (PCA) visualization. It supports hypothesis generation from high-dimensional data by identifying the variable subsets that give the clearest visualizations.


Area of Science:

  • Data Science
  • Bioinformatics
  • Statistical Visualization

Background:

  • High-dimensional datasets are increasingly common in exploratory research for hypothesis generation.
  • Principal Component Analysis (PCA) is a key visualization method, but the structure it reveals can be obscured by non-informative variables.
  • Current variable filtering methods for PCA lack objective criteria for optimal subset selection.
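
To make the background concrete, here is a minimal Python sketch of PCA on a full dataset and on a variance-filtered subset. It is an illustration under stated assumptions, not the authors' code: the synthetic two-group data, the function names `pca_scores` and `variance_filter`, and the choice of a variance filter with k = 20 are all hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance fraction per component
    return U[:, :n_components] * s[:n_components], explained[:n_components]

def variance_filter(X, k):
    """Return the column indices of the k variables with the largest variance."""
    return np.argsort(X.var(axis=0))[::-1][:k]

# Hypothetical synthetic data: 20 variables carry a two-group shift,
# 480 variables are pure noise.
rng = np.random.default_rng(0)
n, p_signal, p_noise = 40, 20, 480
groups = np.repeat([0, 1], n // 2)
signal = rng.normal(size=(n, p_signal)) + 3.0 * groups[:, None]
noise = rng.normal(size=(n, p_noise))
X = np.hstack([signal, noise])

scores_all, ev_all = pca_scores(X)               # PCA on all 500 variables
keep = variance_filter(X, 20)                    # arbitrary cutoff, chosen by hand
scores_sub, ev_sub = pca_scores(X[:, keep])      # PCA on the filtered subset

print("variance explained by PC1+PC2, all variables:", ev_all.sum())
print("variance explained by PC1+PC2, filtered:     ", ev_sub.sum())
```

The cutoff k = 20 is picked by hand here, which is exactly the kind of subjective choice the missing objective criterion is meant to replace.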

Purpose of the Study:

  • To introduce an objective measure for assessing variable subset informativeness in PCA visualization.
  • To enable systematic selection of optimal variable subsets for improved exploratory data analysis.

Main Methods:

  • Development of the 'projection score,' a measure of variable subset informativeness for PCA.
  • Application of the projection score to identify optimal variable subsets across different filtering techniques.
  • Validation using both microarray and synthetic datasets.
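
The summary does not give the formula for the projection score, so the following Python sketch should be read as one plausible permutation-based reading of the idea, not as the authors' definition. It assumes the score compares the share of variance captured by the leading principal components with the share expected when each variable is permuted independently. The names `explained_fraction`, `permutation_score`, and `best_subset_size`, the square-root form, the variance-ranking filter, and the synthetic data are all assumptions introduced for illustration.

```python
import numpy as np

def explained_fraction(X, n_components):
    """Square root of the variance share captured by the leading components."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values only
    return np.sqrt(np.sum(s[:n_components] ** 2) / np.sum(s ** 2))

def permutation_score(X, n_components=2, n_perm=50, seed=0):
    """Observed explained fraction minus its mean under a permutation null.

    Assumed null model: permuting each column independently removes the
    dependence between variables while keeping each variable's own values.
    """
    rng = np.random.default_rng(seed)
    observed = explained_fraction(X, n_components)
    null = []
    for _ in range(n_perm):
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null.append(explained_fraction(Xp, n_components))
    return observed - np.mean(null)

def best_subset_size(X, sizes, **kwargs):
    """Score the top-k highest-variance variables for each k and pick the best."""
    order = np.argsort(X.var(axis=0))[::-1]
    scores = {k: permutation_score(X[:, order[:k]], **kwargs) for k in sizes}
    return max(scores, key=scores.get), scores

# Hypothetical synthetic data: 20 structured variables plus 480 noise variables.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 20)
signal = rng.normal(size=(40, 20)) + 3.0 * groups[:, None]
noise = rng.normal(size=(40, 480))
X = np.hstack([signal, noise])

s_struct = permutation_score(signal)             # structured subset
s_noise = permutation_score(noise[:, :20])       # unstructured subset
best_k, scores = best_subset_size(X, sizes=[10, 20, 100, 500])
print(s_struct, s_noise, best_k)
```

Under these assumptions a subset with real structure scores well above zero, while a subset of independent noise variables scores near zero, so the maximum over subset sizes gives a data-driven cutoff for whichever filter produced the ranking.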

Main Results:

  • The projection score effectively quantifies the informativeness of variable subsets for PCA.
  • Optimal variable subsets were identified for various filtering methods, leading to enhanced visualization clarity.
  • The score's applicability was demonstrated on both microarray and synthetic datasets.

Conclusions:

  • The projection score is an interpretable and universally applicable metric for PCA visualization.
  • It facilitates systematic identification of the most informative variable subsets for practical exploratory analysis.
  • This method improves the generation of relevant hypotheses from complex data.