



Clustering with t-SNE, provably.

George C Linderman (1), Stefan Steinerberger (2)

  • (1) Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA.

SIAM Journal on Mathematics of Data Science | October 19, 2020
Summary
This summary is machine-generated.

This study proves mathematically that t-distributed Stochastic Neighborhood Embedding (t-SNE) can recover well-separated clusters, and it analyzes the early exaggeration phase of the optimization in detail. The findings yield new guidelines for setting the exaggeration parameter (α) and step size (h), which improve embedding quality.

Keywords:
convergence rates, dimensionality reduction, spectral clustering, t-SNE, theoretical guarantees


Area of Science:

  • Computational statistics
  • Machine learning
  • Data visualization

Background:

  • t-distributed Stochastic Neighborhood Embedding (t-SNE) is a widely adopted clustering and visualization technique in the natural sciences.
  • Despite its prevalence, t-SNE lacks rigorous mathematical foundations, and its internal mechanisms remain poorly understood.
  • Existing research highlights the need for a deeper theoretical understanding of t-SNE's behavior and optimization.

Purpose of the Study:

  • To provide a mathematical proof demonstrating t-SNE's capability to recover well-separated clusters.
  • To rigorously analyze the 'early exaggeration' phase of t-SNE, a key optimization technique.
  • To derive novel insights for setting the exaggeration parameter (α) and step size (h).
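
For orientation, the two parameters named above enter the standard t-SNE gradient-descent update roughly as follows. This is a sketch in the usual t-SNE notation, which this summary does not define and which is assumed here: p_ij are the input affinities, q_ij the embedding affinities, and y_i the embedded points. The paper may absorb the constant factor differently.

```latex
q_{ij} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \|y_k - y_l\|^2\right)^{-1}},
\qquad
y_i \;\leftarrow\; y_i \;-\; 4h \sum_{j \neq i}
    \left(\alpha\, p_{ij} - q_{ij}\right)
    \frac{y_i - y_j}{1 + \|y_i - y_j\|^2}
```

With α = 1 this is the plain update. During early exaggeration, α > 1 strengthens the attractive term α p_ij relative to the repulsive term q_ij, and h scales the size of every step.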

Main Methods:

  • Mathematical proof of t-SNE's cluster recovery capabilities.
  • Analysis of the 'early exaggeration' phase using theoretical frameworks.
  • Numerical experiments to validate proposed parameter tuning rules.

Main Results:

  • The study rigorously proves that t-SNE can effectively recover well-separated clusters.
  • The analysis of the 'early exaggeration' phase provides a theoretical basis for its optimization.
  • Novel rules for setting the exaggeration parameter (α) and step size (h) are proposed and validated.
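
The cluster-recovery claim can be illustrated with a small NumPy experiment. This is a minimal sketch, not the authors' code: the idealized block-structured affinity matrix, the function names, and the values alpha=4.0 and h=0.5 are all assumptions chosen for illustration and are not the tuning rules proposed in the paper.

```python
import numpy as np

def exaggerated_step(Y, P, alpha, h):
    """One gradient-descent step on the t-SNE loss with P scaled by alpha."""
    diff = Y[:, None, :] - Y[None, :, :]           # y_i - y_j for all pairs
    W = 1.0 / (1.0 + np.sum(diff ** 2, axis=-1))   # Student-t kernel weights
    np.fill_diagonal(W, 0.0)
    Q = W / W.sum()                                # embedding affinities q_ij
    coeff = (alpha * P - Q) * W                    # attraction minus repulsion
    grad = 4.0 * np.sum(coeff[:, :, None] * diff, axis=1)
    return Y - h * grad

def block_affinities(sizes):
    """Idealized P (assumption): uniform within a cluster, zero across clusters."""
    n = sum(sizes)
    P = np.zeros((n, n))
    start = 0
    for s in sizes:
        P[start:start + s, start:start + s] = 1.0
        start += s
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def extent(points):
    """Largest coordinate-wise range of a point set."""
    return float((points.max(axis=0) - points.min(axis=0)).max())

def run_demo(alpha=4.0, h=0.5, steps=200, seed=0):
    """Embed two idealized clusters; return (within-cluster extent, center gap)."""
    sizes = [10, 10]
    P = block_affinities(sizes)
    rng = np.random.default_rng(seed)
    Y = 1e-2 * rng.standard_normal((sum(sizes), 2))  # small random start
    for _ in range(steps):
        Y = exaggerated_step(Y, P, alpha, h)
    A, B = Y[:sizes[0]], Y[sizes[0]:]
    spread = max(extent(A), extent(B))
    gap = float(np.linalg.norm(A.mean(axis=0) - B.mean(axis=0)))
    return spread, gap

if __name__ == "__main__":
    spread, gap = run_demo()
    print(f"within-cluster extent: {spread:.2e}, gap between centers: {gap:.2e}")
```

In this toy setting each cluster contracts toward a point while the two cluster centers drift apart, which is the qualitative behavior the result describes. Real affinities are computed from data and are only approximately block-structured.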

Conclusions:

  • The mathematical framework confirms t-SNE's effectiveness in preserving cluster structures.
  • The proposed parameter tuning strategies enhance the quality of embeddings, especially for topological structures like the Swiss roll.
  • A connection between t-SNE and spectral clustering methods is explored, suggesting potential for hybrid approaches.
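
One way to see why a link to spectral clustering is plausible is sketched below. It rests on two simplifying assumptions that are made here and are not taken from the summary: repulsion is ignored, and the embedded points are close enough that the kernel weights are near 1. Under those assumptions the exaggerated update becomes the linear map Y ← (I − 4αhL)Y, where L = D − P is the graph Laplacian of the affinities. Repeating it acts like a power iteration that retains the near-zero eigenvectors of L, which for well-separated clusters are the cluster indicators. All names and values are illustrative.

```python
import numpy as np

def graph_laplacian(P):
    """Unnormalized graph Laplacian L = D - P of a symmetric affinity matrix."""
    return np.diag(P.sum(axis=1)) - P

def attraction_only_iterate(Y0, P, alpha, h, steps):
    """Apply Y <- (I - 4*alpha*h*L) Y repeatedly (repulsion ignored, weights ~ 1)."""
    M = np.eye(len(P)) - 4.0 * alpha * h * graph_laplacian(P)
    Y = Y0.copy()
    for _ in range(steps):
        Y = M @ Y
    return Y

# Idealized affinities (assumption): two clusters of 10 points, no cross-cluster weight.
n_per = 10
P = np.kron(np.eye(2), np.ones((n_per, n_per)))
np.fill_diagonal(P, 0.0)
P /= P.sum()

rng = np.random.default_rng(0)
Y0 = rng.standard_normal((2 * n_per, 2))
Y = attraction_only_iterate(Y0, P, alpha=4.0, h=0.5, steps=60)

# Each cluster collapses onto its own initial mean: only the cluster-indicator
# directions (eigenvalue 0 of L) survive the iteration.
print(np.abs(Y[:n_per] - Y0[:n_per].mean(axis=0)).max())
print(np.abs(Y[n_per:] - Y0[n_per:].mean(axis=0)).max())
```

The sketch only illustrates the linearized attractive dynamics. The full method also includes the repulsive term, which is what pushes the collapsed clusters apart.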