Clustering Scatter Plots Using Data Depth Measures.

Zhanpan Zhang¹, Xinping Cui¹, Daniel R Jeske¹

¹Department of Statistics, University of California, Riverside, CA, USA.

Journal of Biometrics & Biostatistics

June 3, 2014

Summary

This summary is machine-generated.

This study introduces a novel hierarchical clustering method for scatter plot data matrices, enhancing data mining capabilities in fields like bioinformatics. The new approach uses "data depth" for accurate dissimilarity measurement and integrates hypothesis testing for robust clustering and visualization.

Keywords:

Clustering; Data Depth; Quality Index; Scatter Plot; Visualization

Area of Science:

Data Mining
Bioinformatics
Computational Statistics

Background:

Clustering is a key data mining technique widely used in bioinformatics and text mining.
Existing clustering methods are limited to data matrices with scalar entries and cannot capture complex bivariate distributions.
There is a need for advanced clustering methods capable of handling scatter plot data.

Purpose of the Study:

To introduce a novel hierarchical clustering procedure for data matrices of scatter plots.
To develop a dissimilarity statistic based on "data depth" for accurate bivariate distribution comparison.
To integrate hypothesis testing with clustering for simultaneous row and column analysis and to enable visualization of results.

Main Methods:

Developed a hierarchical clustering algorithm specifically designed for scatter plot matrices.
Introduced a "data depth" based dissimilarity measure to quantify differences between bivariate distributions.
Combined hypothesis testing with hierarchical clustering for joint row-column clustering.
Proposed novel painting metrics and heat map construction for cluster visualization.
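
The general idea behind the methods listed above can be sketched in a few lines. This summary does not say which depth function or dissimilarity statistic the authors use, so the sketch below makes its own assumptions: it uses Mahalanobis depth as a stand-in depth function, takes the mean absolute difference in depth over the pooled points as an illustrative dissimilarity, and merges scatter plots with average linkage. The function names (`mahalanobis_depth`, `depth_dissimilarity`, `cluster_scatter_plots`) and the synthetic data are hypothetical, and the hypothesis-testing and heat map steps are not covered.

```python
import random
from itertools import combinations


def mahalanobis_depth(point, cloud):
    """Depth of a 2-D point w.r.t. a cloud: 1 / (1 + squared Mahalanobis distance).

    Assumption: Mahalanobis depth is a stand-in; the source does not name a depth function.
    Requires a cloud with a non-singular sample covariance matrix.
    """
    n = len(cloud)
    mx = sum(x for x, _ in cloud) / n
    my = sum(y for _, y in cloud) / n
    sxx = sum((x - mx) ** 2 for x, _ in cloud) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in cloud) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in cloud) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form using the closed-form inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + d2)


def depth_dissimilarity(a, b):
    """Illustrative dissimilarity (an assumption, not the paper's statistic):
    mean absolute difference between each pooled point's depth in a and in b."""
    pooled = a + b
    total = sum(abs(mahalanobis_depth(p, a) - mahalanobis_depth(p, b)) for p in pooled)
    return total / len(pooled)


def cluster_scatter_plots(plots, n_clusters):
    """Agglomerative clustering of scatter plots with average linkage (assumed)."""
    dist = {
        (i, j): depth_dissimilarity(plots[i], plots[j])
        for i, j in combinations(range(len(plots)), 2)
    }

    def linkage(c1, c2):
        pairs = [dist[min(i, j), max(i, j)] for i in c1 for j in c2]
        return sum(pairs) / len(pairs)

    clusters = [[i] for i in range(len(plots))]
    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Synthetic example: two scatter plots centered at (0, 0), two centered at (10, 10).
rng = random.Random(0)


def make_plot(cx, cy, n=30):
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]


plots = [make_plot(0, 0), make_plot(0, 0), make_plot(10, 10), make_plot(10, 10)]
clusters = cluster_scatter_plots(plots, n_clusters=2)
print(clusters)
```

Working on depths rather than on raw coordinates or summary statistics is what lets a dissimilarity of this kind compare whole bivariate distributions. The pairwise loop recomputes the mean and covariance for every point, which keeps the sketch short but would be worth caching for larger matrices.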

Main Results:

The proposed method successfully clusters data matrices of scatter plots, outperforming existing methods for complex data.
The "data depth" metric effectively captures discrepancies in bivariate distributions without oversimplification.
Simulations and a microbe-host interaction study demonstrated the method's utility and power.
Novel visualization techniques provide insightful representations of identified clusters.

Conclusions:

The new hierarchical clustering method offers a powerful tool for analyzing complex, high-dimensional data represented as scatter plots.
The "data depth" metric and integrated hypothesis testing provide a robust framework for uncovering hidden patterns in data.
This approach has significant potential for applications in bioinformatics, systems biology, and other data-intensive scientific fields.