Random matrix approach to categorical data analysis.

Aashay Patil¹, M S Santhanam¹

  • ¹Indian Institute of Science Education and Research, Dr. Homi Bhabha Road, Pune 411 008, India.

Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics | October 15, 2015
Summary
This summary is machine-generated.

We introduce a similarity matrix for analyzing categorical data, a data type common in large public datasets such as social media records and rankings. Statistical analysis shows that the spectra of these matrices agree with random matrix theory predictions, except for the dominant eigenvalue.

Area of Science:

  • Statistics
  • Data Science
  • Social Sciences

Background:

  • Correlation and similarity measures are fundamental across scientific disciplines.
  • Categorical data, comprising qualitative descriptors, presents unique analytical challenges.
  • The proliferation of public domain categorical data necessitates robust analytical tools.

Purpose of the Study:

  • To define and investigate a similarity matrix specifically for categorical data.
  • To analyze the statistical properties of these similarity matrices.
  • To demonstrate the applicability of the proposed method on real-world datasets.

Main Methods:

  • Definition and theoretical study of a similarity matrix for categorical data.
  • Spectral analysis of the similarity matrices.
  • Application of random matrix theory to predict statistical properties.
  • Empirical validation using Indian general election data and North Atlantic sea level pressure data.
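
The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' definition: it assumes a simple-matching similarity (the fraction of observations at which two variables carry the same category label), and the function name `similarity_matrix`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def similarity_matrix(data):
    """Simple-matching similarity between the rows of a categorical array.

    data has shape (n_variables, n_observations); entries are category labels.
    S[i, j] is the fraction of observations at which rows i and j carry the
    same label. This definition is an assumption made for illustration.
    """
    data = np.asarray(data)
    # Compare every pair of rows at every observation: shape (n, n, T).
    matches = data[:, None, :] == data[None, :, :]
    return matches.mean(axis=2)

# Synthetic stand-in data (not from the study):
# 50 variables, 400 observations, 4 equally likely categories.
rng = np.random.default_rng(0)
n_vars, n_obs, n_cats = 50, 400, 4
data = rng.integers(0, n_cats, size=(n_vars, n_obs))

S = similarity_matrix(data)

# S is real and symmetric, so eigvalsh applies; it returns ascending order.
eigenvalues = np.linalg.eigvalsh(S)
dominant, bulk = eigenvalues[-1], eigenvalues[:-1]

print("dominant eigenvalue:", dominant)
print("bulk range:", bulk.min(), bulk.max())
```

For this synthetic input the largest eigenvalue sits well apart from the remaining ones, which is the kind of separation a spectral analysis would then compare against random matrix predictions.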

Main Results:

  • The statistical properties of the spectra of similarity matrices for categorical data largely conform to random matrix predictions.
  • The dominant eigenvalue is a notable exception: it does not follow the random matrix predictions.
  • The approach was demonstrated on two real-world datasets: Indian general election data and North Atlantic sea level pressure data.
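
One heuristic for why a single eigenvalue can separate from the rest is sketched below. It is not taken from the paper and rests on stated assumptions: $N$ variables, each observation drawn independently and uniformly from $k$ categories, and a similarity $S_{ij}$ defined as the fraction of observations at which variables $i$ and $j$ agree.

```latex
\mathbb{E}[S_{ij}] =
\begin{cases}
1, & i = j,\\[2pt]
\dfrac{1}{k}, & i \neq j,
\end{cases}
\qquad\Longrightarrow\qquad
\mathbb{E}[S] = \Bigl(1 - \frac{1}{k}\Bigr) I_N + \frac{1}{k}\,\mathbf{1}\mathbf{1}^{\top}.

% Eigenvalues of the expected matrix:
\lambda_{\max} = 1 + \frac{N-1}{k}
\quad (\text{eigenvector } \mathbf{1}),
\qquad
\lambda = 1 - \frac{1}{k}
\quad (\text{multiplicity } N-1).
```

Under these assumptions the top eigenvalue grows with $N$ while the others stay near $1 - 1/k$, so finite-sample fluctuations spread the lower eigenvalues into a bulk but leave the dominant one isolated.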

Conclusions:

  • The developed similarity matrix offers a viable method for analyzing categorical data.
  • The findings support the utility of random matrix theory in understanding the spectral properties of similarity matrices derived from categorical data.
  • The method's successful application highlights its potential for analyzing large-scale categorical datasets in various scientific fields.