



Simulation-derived best practices for clustering clinical data.

Caitlin E Coombes1, Xin Liu2, Zachary B Abrams3

  • 1The Ohio State University College of Medicine, 370 W 9th Ave, Columbus, OH 43210, USA.

Journal of Biomedical Informatics | April 16, 2021
Summary
This summary is machine-generated.

Choosing the right distance metric is crucial for accurate patient clustering in clinical data. The DAISY metric with hierarchical clustering (HC) effectively identifies distinct patient groups, improving disease understanding and precision medicine.

Keywords:
Clinical informatics; Clinical trial; Clustering; Electronic health record; Unsupervised machine learning


Area of Science:

  • Clinical informatics
  • Data science in healthcare
  • Biostatistics

Background:

  • Clustering analyses are vital for understanding patient phenotypes and disease trajectories in clinical medicine.
  • Ensuring rigor, validity, and reproducibility in clinical clustering solutions is an ongoing challenge.
  • Best practices for dissimilarity matrix calculation and clustering on mixed-type clinical data require evaluation.

Purpose of the Study:

  • To evaluate best practices for dissimilarity matrix calculation and clustering on mixed-type clinical data.
  • To compare the performance of various distance metrics and clustering algorithms on simulated and real-world clinical datasets.
  • To identify optimal methods for enhancing patient subclassification and precision medicine.

Main Methods:

  • Simulated clinical data (binary, continuous, categorical, and mixtures) were used to test 5 single distance metrics and 3 mixed distance metrics.
  • Clustering was performed using hierarchical clustering (HC), k-medoids, and self-organizing maps (SOM).
  • Performance was validated using Adjusted Rand Index (ARI) and silhouette width (SW) on simulated and two real-world datasets (chronic lymphocytic leukemia and intensive care unit admissions).
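The pipeline described above (a mixed-type dissimilarity matrix followed by hierarchical clustering) can be sketched in a few lines. This is a minimal illustration, not the study's code: it assumes the DAISY metric behaves like Gower's coefficient (range-scaled differences for numeric features, simple mismatch for categorical ones), it assumes average linkage, and the function names, `kinds` labels, and toy patient records are all hypothetical.

```python
from itertools import combinations


def gower_dissimilarity(rows, kinds):
    """Gower-style dissimilarity matrix for mixed-type rows.

    kinds[j] is "num" (range-scaled absolute difference) or
    "cat" (0 if equal, 1 otherwise; also used here for binary features).
    """
    n, p = len(rows), len(kinds)
    # Range of each numeric column, used to scale differences into [0, 1].
    ranges = []
    for j in range(p):
        if kinds[j] == "num":
            col = [r[j] for r in rows]
            ranges.append(max(col) - min(col))
        else:
            ranges.append(None)
    d = [[0.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        total = 0.0
        for j in range(p):
            x, y = rows[a][j], rows[b][j]
            if kinds[j] == "num":
                total += abs(x - y) / ranges[j] if ranges[j] else 0.0
            else:
                total += 0.0 if x == y else 1.0
        d[a][b] = d[b][a] = total / p  # unweighted mean over features
    return d


def average_linkage(d, k):
    """Naive agglomerative clustering of a distance matrix into k clusters."""
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average pairwise distance between the two candidate clusters.
            dist = sum(d[a][b] for a in clusters[i] for b in clusters[j])
            dist /= len(clusters[i]) * len(clusters[j])
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    labels = [0] * len(d)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels


# Hypothetical toy records: (age, sex, binary comorbidity flag).
patients = [(34, "F", 0), (36, "F", 0), (71, "M", 1), (69, "M", 1)]
kinds = ["num", "cat", "cat"]
D = gower_dissimilarity(patients, kinds)
labels = average_linkage(D, 2)
print(labels)
```

The naive merge loop is cubic or worse in the number of patients, so it is only meant to show the idea; a real analysis would use an established implementation and would also need to handle missing values and asymmetric binary features, which this sketch ignores.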

Main Results:

  • Hierarchical clustering (HC) demonstrated superior performance over k-medoids and SOM, evidenced by higher ARI across data types.
  • The DAISY mixed-type distance metric yielded the highest mean ARI for most mixed data types.
  • DAISY combined with HC identified superior, separable clusters in both real-world clinical datasets.
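For readers unfamiliar with the Adjusted Rand Index used to rank the methods above, the following is a small sketch of the standard pair-counting formula. It is offered as an illustration under stated assumptions rather than the study's implementation: the function name is hypothetical, inputs are assumed to be two equal-length label lists with at least two items, and the degenerate case where the denominator vanishes is treated as perfect agreement.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Pair-counting ARI between two labelings of the same items."""
    n = len(truth)
    # Contingency counts: how many items share each (truth, pred) label pair.
    cells = Counter(zip(truth, pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Both partitions are trivial; treated as perfect agreement (assumption).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # labels permuted
cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # clusters split evenly
print(same, cross)
```

An ARI of 1 means the recovered clusters match the reference grouping up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.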

Conclusions:

  • The selection of appropriate mixed-type distance metrics is essential for optimal patient cluster separation and data utilization.
  • Advanced metrics capable of handling multiple data types enhance the subclassification of diseases.
  • Improved disease subclassification facilitates targeted treatments, precision medicine, clinical decision support, and better patient outcomes.