Updated: Mar 26, 2026

Some Statistical Considerations In Clustering With Binary Data.

Multivariate Behavioral Research

January 30, 2016

Summary

This summary is machine-generated.

This study develops a statistical theory of cluster homogeneity for objects scored on binary variables. It provides significance tests for cluster homogeneity and a framework for deriving metric distances between objects from binary data.

Area of Science:

Statistics
Data Analysis
Cluster Analysis

Background:

Assessing how homogeneous each cluster is helps in judging whether a clustering reflects real structure in the data.
Existing methods may not adequately handle binary variables.
The objects being clustered are often scored on binary (0, 1) attributes.

Purpose of the Study:

To develop a statistical theory for cluster homogeneity with binary data.
To provide test statistics for evaluating cluster homogeneity.
To establish a framework for metric distance derivation with binary variables.

Main Methods:

Used two test statistics proposed by Tryon and Bailey (1970).
Derived the exact sampling distribution of H²_r, the squared homogeneity of cluster g on variable r.
Derived formulas for the mean and variance of H², the overall homogeneity of cluster g.
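
As a rough illustration of what an exact sampling distribution under random assortment can look like, the sketch below relies on one standard fact: if a cluster of n objects is drawn at random from N objects, K of which score 1 on a variable, the number of 1s in the cluster follows a hypergeometric distribution. The homogeneity index used here (1 minus the ratio of within-cluster variance to total variance) is an assumption made for illustration and is not the Tryon and Bailey statistic; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def ones_count_distribution(N, K, n):
    """Exact pmf of the number of 1s in a cluster of n objects drawn at
    random, without replacement, from N objects of which K score 1."""
    lo, hi = max(0, n - (N - K)), min(n, K)
    total = comb(N, n)
    return {k: Fraction(comb(K, k) * comb(N - K, n - k), total)
            for k in range(lo, hi + 1)}


def homogeneity(k, n, N, K):
    """ASSUMED index, for illustration only:
    1 - (within-cluster variance / total variance) of a 0/1 variable."""
    if not 0 < K < N:
        raise ValueError("the variable must take both values in the full sample")
    p_cluster, p_total = Fraction(k, n), Fraction(K, N)
    return 1 - (p_cluster * (1 - p_cluster)) / (p_total * (1 - p_total))


def exact_p_value(k_obs, n, N, K):
    """P(index >= observed index) under random assortment."""
    h_obs = homogeneity(k_obs, n, N, K)
    pmf = ones_count_distribution(N, K, n)
    return sum(p for k, p in pmf.items() if homogeneity(k, n, N, K) >= h_obs)


def index_moments(n, N, K):
    """Exact mean and variance of the assumed index under random assortment."""
    pmf = ones_count_distribution(N, K, n)
    mean = sum(p * homogeneity(k, n, N, K) for k, p in pmf.items())
    second = sum(p * homogeneity(k, n, N, K) ** 2 for k, p in pmf.items())
    return mean, second - mean ** 2
```

For example, with N = 20 objects, K = 10 of them scoring 1, and a cluster of n = 5 that contains five 1s, `exact_p_value(5, 5, 20, 10)` returns 21/646 (about 0.033). Using `Fraction` keeps every probability exact rather than rounded.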

Main Results:

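The exact sampling distribution supports significance tests of a cluster's homogeneity on a single variable under the hypothesis of random assortment.
The formulas for the mean and variance of H² support significance tests of a cluster's overall homogeneity.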
A framework for metric distances between objects with binary scores is proposed.
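
The summary does not spell out the proposed distance framework. As a generic illustration only, the sketch below uses two standard distances for 0/1 score vectors: the Hamming distance (the number of variables on which two objects differ) and the Euclidean distance, which for binary scores equals the square root of the Hamming distance. The function names and the sample objects are hypothetical.

```python
from itertools import permutations
from math import sqrt


def hamming(x, y):
    """Number of binary variables on which objects x and y differ."""
    if len(x) != len(y):
        raise ValueError("objects must be scored on the same variables")
    return sum(a != b for a, b in zip(x, y))


def euclidean(x, y):
    """Euclidean distance; for 0/1 scores this is sqrt(Hamming distance)."""
    return sqrt(hamming(x, y))


def satisfies_triangle_inequality(points, dist, tol=1e-12):
    """Check d(x, z) <= d(x, y) + d(y, z) for every ordered triple."""
    return all(dist(x, z) <= dist(x, y) + dist(y, z) + tol
               for x, y, z in permutations(points, 3))


# Hypothetical objects scored on four binary variables.
objects = [(1, 0, 1, 1), (1, 1, 0, 1), (0, 0, 0, 1), (0, 1, 1, 0)]
```

Both distances are zero only for identical score vectors, are symmetric, and satisfy the triangle inequality, which is what makes them metrics; `satisfies_triangle_inequality(objects, hamming)` checks the last property on the sample objects.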

Conclusions:

The theory provides significance tests for cluster homogeneity when the variables are binary.
The test statistics and the distance framework give cluster analyses of binary data a formal statistical basis.