


Updated: Apr 28, 2026

Clustering Scatter Plots Using Data Depth Measures.

Zhanpan Zhang¹, Xinping Cui¹, Daniel R Jeske¹

  • ¹Department of Statistics, University of California, Riverside, CA, USA.

Journal of Biometrics & Biostatistics | June 3, 2014
Summary
This summary is machine-generated.

This study introduces a novel hierarchical clustering method for scatter plot data matrices, enhancing data mining capabilities in fields like bioinformatics. The new approach uses "data depth" for accurate dissimilarity measurement and integrates hypothesis testing for robust clustering and visualization.

Keywords:
Clustering, Data Depth, Quality Index, Scatter Plot, Visualization



Area of Science:

  • Data Mining
  • Bioinformatics
  • Computational Statistics

Background:

  • Clustering is a key data mining technique widely used in bioinformatics and text mining.
  • Existing clustering methods assume a data matrix whose entries are scalars, so they cannot compare entries that are themselves bivariate distributions (scatter plots).
  • There is a need for advanced clustering methods capable of handling scatter plot data.

Purpose of the Study:

  • To introduce a novel hierarchical clustering procedure for data matrices of scatter plots.
  • To develop a dissimilarity statistic based on "data depth" for accurate bivariate distribution comparison.
  • To integrate hypothesis testing with clustering for simultaneous row and column analysis and to enable visualization of results.
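
To make the idea of a depth-based dissimilarity and an accompanying test concrete, here is a minimal Python sketch under stated assumptions. The summary does not say which depth function or statistic the study uses, so Mahalanobis depth, a quality-index style comparison (the probability that a point from one cloud is no deeper than a point from the other), and a permutation test are swapped in for illustration. All function names (`depth_function`, `quality_index`, `depth_dissimilarity`, `permutation_p_value`) are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right


def depth_function(cloud):
    """Return a function giving the Mahalanobis depth of a 2-D point
    relative to `cloud`, a list of (x, y) pairs.
    Assumption: Mahalanobis depth stands in for the study's depth measure."""
    n = len(cloud)
    mx = sum(x for x, _ in cloud) / n
    my = sum(y for _, y in cloud) / n
    # Entries of the symmetric 2 x 2 sample covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in cloud) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in cloud) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in cloud) / (n - 1)
    det = sxx * syy - sxy * sxy  # non-zero unless the points are collinear

    def depth(point):
        dx, dy = point[0] - mx, point[1] - my
        # Squared Mahalanobis distance via the closed-form 2 x 2 inverse.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        return 1.0 / (1.0 + d2)

    return depth


def quality_index(cloud_a, cloud_b):
    """Estimate the chance that a point of cloud_a is no deeper (within
    cloud_a) than a point of cloud_b. Ties count one half, so two
    identical clouds give exactly 0.5."""
    depth = depth_function(cloud_a)
    depths_a = sorted(depth(p) for p in cloud_a)
    total = 0.0
    for p in cloud_b:
        d = depth(p)
        below = bisect_left(depths_a, d)
        ties = bisect_right(depths_a, d) - below
        total += below + 0.5 * ties
    return total / (len(cloud_a) * len(cloud_b))


def depth_dissimilarity(cloud_a, cloud_b):
    """Symmetric dissimilarity in [0, 1]; values near 0 mean similar clouds."""
    return (abs(quality_index(cloud_a, cloud_b) - 0.5)
            + abs(quality_index(cloud_b, cloud_a) - 0.5))


def permutation_p_value(cloud_a, cloud_b, n_perm=200, seed=0):
    """Permutation p-value for 'both clouds share one distribution'."""
    rng = random.Random(seed)
    observed = depth_dissimilarity(cloud_a, cloud_b)
    pooled = list(cloud_a) + list(cloud_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(cloud_a)], pooled[len(cloud_a):]
        if depth_dissimilarity(perm_a, perm_b) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    same = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
    shifted = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(60)]
    print(depth_dissimilarity(same, shifted))
    print(permutation_p_value(same, shifted))
```

Because the comparison works on depth ranks rather than on a single summary such as a correlation coefficient, it can react to differences in location, spread, or shape. Mahalanobis depth is only one choice, and it is less suited to strongly non-elliptical clouds.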

Main Methods:

  • Developed a hierarchical clustering algorithm specifically designed for scatter plot matrices.
  • Introduced a "data depth" based dissimilarity measure to quantify differences between bivariate distributions.
  • Combined hypothesis testing with hierarchical clustering for joint row-column clustering.
  • Proposed novel painting metrics and heat map construction for cluster visualization.
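
The clustering step listed above can be sketched as plain agglomerative clustering over a precomputed dissimilarity matrix. This is a minimal sketch under stated assumptions, not the study's procedure: average linkage is a swapped-in choice, the fixed `threshold` stands in for a hypothesis-test decision about whether two groups differ, and the sketch clusters one set of items rather than rows and columns jointly. The function name `average_linkage_clusters` is hypothetical.

```python
def average_linkage_clusters(dissim, threshold):
    """Agglomerative clustering on a symmetric dissimilarity matrix.

    Repeatedly merges the two clusters with the smallest average pairwise
    dissimilarity and stops once that smallest value exceeds `threshold`.
    Assumption: `threshold` is a plain number standing in for a test-based
    stopping rule.
    """
    clusters = [[i] for i in range(len(dissim))]

    def linkage(c1, c2):
        # Average dissimilarity over all cross-cluster pairs.
        return sum(dissim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # the closest remaining clusters are treated as distinct
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Items 0 and 1 are alike, items 2 and 3 are alike, the two groups differ.
    example = [
        [0.0, 0.1, 0.9, 0.9],
        [0.1, 0.0, 0.9, 0.9],
        [0.9, 0.9, 0.0, 0.1],
        [0.9, 0.9, 0.1, 0.0],
    ]
    print(average_linkage_clusters(example, threshold=0.5))
```

Replacing the fixed threshold with a p-value cutoff from a test on each candidate merge would bring the sketch closer to the combined testing-and-clustering idea described in the list above.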

Main Results:

  • The proposed method successfully clusters data matrices of scatter plots, outperforming existing methods for complex data.
  • The "data depth" metric effectively captures discrepancies in bivariate distributions without oversimplification.
  • Simulations and a microbe-host interaction study demonstrated the method's utility and power.
  • Novel visualization techniques provide insightful representations of identified clusters.
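
One simple way to see cluster structure in a heat map is to reorder the rows and columns of a matrix so that members of the same cluster sit next to each other. The sketch below illustrates only that reordering idea with a text rendering. It does not reproduce the study's painting metrics, which the summary does not describe, and the names `cluster_order`, `reorder`, and `text_heat_map` are hypothetical.

```python
def cluster_order(clusters):
    """Flatten a list of clusters into one index order, cluster by cluster."""
    return [i for cluster in clusters for i in cluster]


def reorder(matrix, order):
    """Apply the same ordering to the rows and columns of a square matrix."""
    return [[matrix[i][j] for j in order] for i in order]


def text_heat_map(matrix, shades=" .:*#"):
    """Render a matrix as text, mapping low values to light characters
    and high values to dark ones."""
    lo = min(min(row) for row in matrix)
    hi = max(max(row) for row in matrix)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant matrix
    lines = []
    for row in matrix:
        chars = []
        for value in row:
            level = int((value - lo) / span * len(shades))
            chars.append(shades[min(level, len(shades) - 1)])
        lines.append("".join(chars))
    return "\n".join(lines)


if __name__ == "__main__":
    # Items 0 and 2 form one cluster, items 1 and 3 the other.
    clusters = [[0, 2], [1, 3]]
    dissim = [[0.0 if i % 2 == j % 2 else 1.0 for j in range(4)]
              for i in range(4)]
    print(text_heat_map(reorder(dissim, cluster_order(clusters))))
```

After reordering, low-dissimilarity cells gather into blocks along the diagonal, which is the visual cue a cluster heat map relies on.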

Conclusions:

  • The new hierarchical clustering method offers a powerful tool for analyzing complex, high-dimensional data represented as scatter plots.
  • The "data depth" metric and integrated hypothesis testing provide a robust framework for uncovering hidden patterns in data.
  • This approach has significant potential for applications in bioinformatics, systems biology, and other data-intensive scientific fields.