


Updated: Mar 1, 2026


optCluster: An R Package for Determining the Optimal Clustering Algorithm.

Michael Sekula¹, Somnath Datta², Susmita Datta²

¹Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, Kentucky, 40202, USA.

June 7, 2017

Summary

This summary is machine-generated.

Determining the best clustering solution can be challenging. The optCluster R package combines multiple validation measures through weighted rank aggregation, giving an objective basis for selecting the optimal clustering of genomic data.

Keywords:

Clustering, Gene Expression, RNA-Seq, Validation


Area of Science:

Bioinformatics
Computational Biology
Data Science

Background:

Numerous clustering validation tools exist, but comparing multiple solutions using various measures is often subjective.
Visual inspection of clustering results can be insufficient for determining the optimal partition.
The optCluster R package addresses the need for objective evaluation of clustering performance.

Purpose of the Study:

To introduce optCluster, an R package designed for simultaneous comparison of multiple clustering partitions.
To provide an objective method for selecting the best clustering solution from various algorithms and cluster numbers.
To facilitate the analysis of genomic and RNA sequencing data.

Main Methods:

Utilizes weighted rank aggregation to objectively combine scores from diverse performance measures.
Offers a single-function interface for comparing numerous clustering partitions.
Incorporates biological validation measures and specialized algorithms for RNA sequencing data.
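
The weighted rank aggregation idea named above can be sketched roughly as follows. This is a minimal illustration and not the optCluster implementation: it assumes each validation measure produces a best-first list of candidate partitions with scores, and it searches every possible ordering by brute force for the one that minimizes a weighted Spearman footrule distance to those lists (one common choice for this kind of aggregation). The partition labels, measure names, scores, and function names are all hypothetical, and exhaustive search is only feasible for a handful of candidates.

```python
from itertools import permutations

# Hypothetical validation scores, invented for illustration only.
# Every measure is oriented so that a higher score is better.
SCORES = {
    "measure_a": {"kmeans_k2": 0.90, "kmeans_k3": 0.60, "hier_k2": 0.50, "hier_k3": 0.10},
    "measure_b": {"kmeans_k2": 0.40, "kmeans_k3": 0.80, "hier_k2": 0.70, "hier_k3": 0.20},
    "measure_c": {"kmeans_k2": 0.30, "kmeans_k3": 0.90, "hier_k2": 0.60, "hier_k3": 0.50},
}


def ranked_list(measure_scores):
    """Return (items ordered best-first, min-max normalized scores in that order)."""
    items = sorted(measure_scores, key=measure_scores.get, reverse=True)
    lo, hi = min(measure_scores.values()), max(measure_scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
    weights = [(measure_scores[item] - lo) / span for item in items]
    return items, weights


def weighted_footrule(candidate, items, weights):
    """Sum of position shifts between two orderings, each weighted by the
    gap between the normalized scores found at the two positions."""
    total = 0.0
    for pos_c, item in enumerate(candidate):
        pos_l = items.index(item)
        total += abs(weights[pos_c] - weights[pos_l]) * abs(pos_c - pos_l)
    return total


def aggregate(scores):
    """Brute-force search for the ordering closest to all per-measure lists."""
    lists = [ranked_list(s) for s in scores.values()]
    candidates = lists[0][0]

    def cost(order):
        return sum(weighted_footrule(order, items, w) for items, w in lists)

    best = min(permutations(candidates), key=cost)
    return list(best), cost(best)


if __name__ == "__main__":
    order, distance = aggregate(SCORES)
    print("aggregated order:", order)
    print("total weighted distance:", round(distance, 4))
```

The first entry of the aggregated order is the partition this sketch would report as the overall choice. Weighting each position shift by the score gap means that swapping two partitions with nearly equal scores costs little, while a swap across a large score gap is penalized heavily.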

Main Results:

Provides an objective approach to selecting the optimal clustering solution, replacing the guesswork of visual inspection.
Enables simultaneous comparison of clustering partitions generated by different algorithms and cluster counts.
Streamlines the process of identifying the best clustering for a given dataset.
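
To make the idea of comparing partitions across algorithms and cluster counts concrete, the sketch below enumerates a small grid of candidates and orders them separately under each validation measure. It is an illustration under stated assumptions rather than the package's interface: the algorithm labels, cluster counts, choice of measures, scores, and function name are all hypothetical. The only facts relied on are that a larger silhouette width indicates a better partition and a smaller connectivity value indicates a better one.

```python
from itertools import product

ALGORITHMS = ["kmeans", "hierarchical"]  # illustrative algorithm labels
CLUSTER_COUNTS = [2, 3]                  # illustrative numbers of clusters

# True when larger values indicate a better partition.
HIGHER_IS_BETTER = {"silhouette": True, "connectivity": False}

# Hypothetical scores, invented for illustration; keyed by (algorithm, k).
SCORES = {
    ("kmeans", 2): {"silhouette": 0.41, "connectivity": 12.0},
    ("kmeans", 3): {"silhouette": 0.55, "connectivity": 20.5},
    ("hierarchical", 2): {"silhouette": 0.38, "connectivity": 8.3},
    ("hierarchical", 3): {"silhouette": 0.52, "connectivity": 15.1},
}


def per_measure_rankings(scores, higher_is_better):
    """Order every candidate partition best-first under each measure."""
    rankings = {}
    for measure, maximize in higher_is_better.items():
        rankings[measure] = sorted(
            scores, key=lambda cand: scores[cand][measure], reverse=maximize
        )
    return rankings


if __name__ == "__main__":
    # Confirm that the score table covers the full algorithm-by-k grid.
    assert set(SCORES) == set(product(ALGORITHMS, CLUSTER_COUNTS))
    for measure, ranking in per_measure_rankings(SCORES, HIGHER_IS_BETTER).items():
        print(measure, "->", ranking)
```

With these invented numbers the two measures favor different partitions, which is the kind of disagreement that a rank aggregation step is meant to resolve.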

Conclusions:

The optCluster package offers a robust and objective method for clustering validation, particularly for genomic data.
Its weighted rank aggregation approach simplifies the selection of optimal clustering solutions.
It serves as a valuable tool for researchers working with RNA sequencing and other genomic datasets.