Shape-aware stochastic neighbor embedding for robust data visualisations.

Tobias Wängberg¹, Joanna Tyrcha², Chun-Biu Li³

¹Department of Mathematics, Stockholm University, Stockholm, Sweden.

BMC Bioinformatics

November 15, 2022

Summary

This summary is machine-generated.

A new shape-aware stochastic neighbor embedding method improves visualization of high-dimensional data by accurately representing cluster structures and hierarchies, outperforming t-SNE, UMAP, and PHATE.

Keywords:

Data visualisation
Dimensionality reduction
Dimensionality reduction validation
Graph distance

Area of Science:

Data visualization
Machine learning
Bioinformatics

Background:

t-distributed Stochastic Neighbor Embedding (t-SNE) is widely used for high-dimensional data visualization, particularly in single-cell transcriptomics.
t-SNE struggles with accurately representing hierarchical relationships and can create spurious patterns.
Existing methods like UMAP and PHATE have limitations in addressing t-SNE's shortcomings.

Purpose of the Study:

To generalize t-SNE using shape-aware graph distances to overcome its limitations.
To develop a method that accurately visualizes hierarchical structures in high-dimensional data.
To provide a robust alternative for dimensionality reduction and data visualization.

Main Methods:

Generalization of t-SNE using shape-aware graph distances.
Application of the proposed method to simulated and real-world datasets (single-cell transcriptomics, MNIST).
Quantitative validation using established indices and comparison with t-SNE, UMAP, and PHATE.
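
This summary does not spell out how the paper's shape-aware graph distance is defined. As a rough illustration of what a graph distance is in general, the sketch below computes shortest-path distances over a k-nearest-neighbour graph, a generic construction also used by methods such as Isomap. It is not the authors' method; the function name `knn_graph_distances` and the choice of `k` are assumptions made for this example.

```python
import math
from itertools import product


def knn_graph_distances(points, k):
    """All-pairs shortest-path distances over a symmetric k-nearest-neighbour graph.

    Illustrative only: a generic graph distance, not the paper's shape-aware one.
    """
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]

    # Start with no edges: zero on the diagonal, infinity elsewhere.
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]

    # Connect each point to its k nearest neighbours (edges are made symmetric).
    for i in range(n):
        neighbours = sorted(
            (j for j in range(n) if j != i), key=lambda j: euclid[i][j]
        )[:k]
        for j in neighbours:
            graph[i][j] = graph[j][i] = euclid[i][j]

    # Floyd-Warshall: m is the outermost loop variable (the intermediate node).
    for m, i, j in product(range(n), repeat=3):
        via = graph[i][m] + graph[m][j]
        if via < graph[i][j]:
            graph[i][j] = via
    return graph


# Seven points on a unit half circle: the two ends are close in a straight
# line but far apart when measured along the curve.
arc = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in range(0, 181, 30)]
geo = knn_graph_distances(arc, k=2)
straight = math.dist(arc[0], arc[-1])
print(f"straight-line: {straight:.3f}, along graph: {geo[0][-1]:.3f}")
```

On this toy half circle the graph distance between the two end points is larger than their straight-line distance, because paths have to follow the curve. That is the kind of shape information a Euclidean distance ignores.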

Main Results:

The proposed method significantly outperforms t-SNE, UMAP, and PHATE on imbalanced, nonlinear, continuous, and hierarchically structured data.
Faithful low-dimensional embeddings were achieved on both simulated and real-world datasets.
The method's single hyper-parameter can be automatically and optimally selected in a data-driven manner.
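
The summary does not name the validation indices the authors used. One common way to quantify how faithful an embedding is, shown here purely as an assumed example, is to measure how many of each point's k nearest neighbours in the original space are still among its k nearest neighbours after embedding. The function names `knn_sets` and `neighbour_preservation` are hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbours (Euclidean)."""
    n = len(points)
    result = []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        result.append(set(order[:k]))
    return result


def neighbour_preservation(high_dim, low_dim, k):
    """Mean fraction of each point's k nearest neighbours kept by the embedding.

    1.0 means every neighbourhood is preserved; lower values mean more distortion.
    """
    high = knn_sets(high_dim, k)
    low = knn_sets(low_dim, k)
    return sum(len(h & l) / k for h, l in zip(high, low)) / len(high_dim)


# Toy data: 3-D points that really lie on a line.
high_dim = [(0, 0, 5), (1, 0, 5), (3, 0, 5), (7, 0, 5), (12, 0, 5)]
faithful = [(0,), (1,), (3,), (7,), (12,)]   # keeps the ordering along the line
scrambled = [(12,), (0,), (7,), (1,), (3,)]  # same values, shuffled between points

score_faithful = neighbour_preservation(high_dim, faithful, k=2)
score_scrambled = neighbour_preservation(high_dim, scrambled, k=2)
print(score_faithful, score_scrambled)
```

A score like this only checks local neighbourhoods. Judging whether hierarchical or global structure is preserved, which is the focus of the paper, would need additional measures.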

Conclusions:

The shape-aware stochastic neighbor embedding method provides robust and accurate low-dimensional visualizations.
This method effectively reveals key structures within high-dimensional data.
It offers a significant improvement over existing dimensionality reduction techniques for complex datasets.