
Testing for group structure in high-dimensional data.

G J McLachlan¹, Suren I Rathnayake

¹Department of Mathematics, University of Queensland, St. Lucia, Queensland, Australia. g.mclachlan@uq.edu.au

Journal of Biopharmaceutical Statistics

October 26, 2011

Summary

This summary is machine-generated.

Determining the optimal number of clusters in high-dimensional data using finite mixture models is challenging. This study evaluates a resampling method, assessing potential bias from dimension reduction in clustering analysis.


Area of Science:

Statistics
Machine Learning
Data Mining

Background:

Finite mixture models are used for data clustering.
Determining the correct number of clusters is a critical challenge.
High-dimensional data, in which the number of variables p far exceeds the number of observations n (p >> n), present particular difficulties for model fitting.
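
For orientation, a finite normal mixture model of the kind referred to above can be written as follows. The notation (g components, mixing proportions π_i, means μ_i, covariance matrices Σ_i, parameter vector Ψ) is standard textbook notation assumed here for illustration; it is not taken from the summary itself.

```latex
f(\mathbf{y}; \boldsymbol{\Psi})
  = \sum_{i=1}^{g} \pi_i \, \phi(\mathbf{y}; \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i),
\qquad \pi_i \ge 0, \quad \sum_{i=1}^{g} \pi_i = 1
```

Here φ denotes the p-dimensional normal density. Each unrestricted covariance matrix Σ_i has p(p+1)/2 free parameters, so when p >> n these cannot be estimated reliably from n observations, which is one way to see why high-dimensional data are hard to fit.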

Purpose of the Study:

To investigate the performance of a resampling approach for determining the number of components in mixture models.
To assess the impact of dimension reduction on this resampling method in high-dimensional settings.

Main Methods:

Utilizing finite mixture models for clustering.
Applying a resampling (bootstrapping) approach to test for the number of components.
Performing dimension reduction techniques to handle high-dimensional data.
Comparing results from bootstrapping on reduced data versus full data.
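
One common way to carry out a resampling test for the number of components is a parametric bootstrap of the likelihood ratio statistic. The sketch below illustrates that idea only, under strong simplifying assumptions: univariate data, a test of one versus two normal components, a basic EM fit, and an arbitrary variance floor. The function names (`fit_one`, `fit_two`, `lrt_stat`, `bootstrap_pvalue`) are hypothetical, and the code is not the authors' implementation; it does not include the dimension reduction step.

```python
import math
import random


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def fit_one(data):
    """Maximum-likelihood fit of a single normal: (mean, var, loglik)."""
    n = len(data)
    mean = sum(data) / n
    var = max(sum((x - mean) ** 2 for x in data) / n, 1e-12)
    return mean, var, sum(normal_logpdf(x, mean, var) for x in data)


def fit_two(data, iters=100, tol=1e-8):
    """EM fit of a two-component normal mixture (needs at least 4 points).

    Returns the log-likelihood at the last E-step.
    """
    xs = sorted(data)
    n = len(xs)
    # Assumed variance floor (1% of the overall variance) to avoid
    # degenerate components collapsing onto a single point.
    floor = 0.01 * fit_one(xs)[1]
    m1, v1, _ = fit_one(xs[: n // 2])  # start from the lower half of the sample
    m2, v2, _ = fit_one(xs[n // 2:])   # and from the upper half
    v1, v2, pi1 = max(v1, floor), max(v2, floor), 0.5
    loglik, prev = -math.inf, -math.inf
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to component 1.
        resp, loglik = [], 0.0
        for x in xs:
            a = math.log(pi1) + normal_logpdf(x, m1, v1)
            b = math.log(1 - pi1) + normal_logpdf(x, m2, v2)
            top = max(a, b)
            lse = top + math.log(math.exp(a - top) + math.exp(b - top))
            loglik += lse
            resp.append(math.exp(a - lse))
        if loglik - prev < tol:
            break
        prev = loglik
        # M-step: weighted updates of proportion, means and variances.
        s1 = min(max(sum(resp), 1e-8), n - 1e-8)
        s2 = n - s1
        pi1 = s1 / n
        m1 = sum(r * x for r, x in zip(resp, xs)) / s1
        m2 = sum((1 - r) * x for r, x in zip(resp, xs)) / s2
        v1 = max(sum(r * (x - m1) ** 2 for r, x in zip(resp, xs)) / s1, floor)
        v2 = max(sum((1 - r) * (x - m2) ** 2 for r, x in zip(resp, xs)) / s2, floor)
    return loglik


def lrt_stat(data):
    """-2 log(lambda) for H0: one component versus H1: two components."""
    return max(0.0, 2.0 * (fit_two(data) - fit_one(data)[2]))


def bootstrap_pvalue(data, n_boot=49, seed=0):
    """Parametric bootstrap p-value for the one-versus-two component test."""
    rng = random.Random(seed)
    observed = lrt_stat(data)
    mean, var, _ = fit_one(data)
    sd = math.sqrt(var)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted null model and recompute the statistic.
        sample = [rng.gauss(mean, sd) for _ in data]
        if lrt_stat(sample) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    # Toy data: two well-separated groups of 30 points each.
    data = [rng.gauss(0, 1) for _ in range(30)] + [rng.gauss(6, 1) for _ in range(30)]
    stat, p = bootstrap_pvalue(data)
    print(f"-2 log lambda = {stat:.2f}, bootstrap p-value = {p:.3f}")
```

The bootstrap is used because the usual chi-squared approximation for the likelihood ratio statistic does not hold when testing the number of mixture components. In a high-dimensional setting, the same loop would be run on data after dimension reduction, which is the step whose potential bias the study examines.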

Main Results:

The study examines the potential for bias when bootstrapping is performed only on dimension-reduced data.
The study also evaluates how the resampling approach performs on high-dimensional data.

Conclusions:

Dimension reduction is necessary for fitting normal mixture models to high-dimensional data.
The study calls into question the practical significance of the bias introduced by bootstrapping only on the reduced data.