Adaptive kernel fuzzy clustering for missing data.

Anny K G Rodrigues¹, Raydonal Ospina¹, Marcelo R P Ferreira²

¹Departamento de Estatística, CASTLab, CCEN, Universidade Federal de Pernambuco, Cidade Universitária, Recife, PE, Brazil.

November 12, 2021

Summary

This summary is machine-generated.

This study introduces a Kernel Fuzzy C-means algorithm for clustering data sets that contain missing values. Of the three missing-data strategies compared, the Optimal Completion Strategy (OCS), which estimates the missing values during optimization, gave the best clustering results and outperformed mean or median imputation.

Area of Science:

Machine learning
Data mining
Clustering analysis

Background:

Missing values are a common challenge in machine learning, impacting clustering algorithm performance.
Existing methods for handling missing data in clustering can lead to suboptimal results.

Purpose of the Study:

To propose and evaluate a Kernel Fuzzy C-means algorithm (VKFCM-K-LP) for clustering with missing data.
To compare three strategies for handling missing data: Whole Data Strategy (WDS), Partial Distance Strategy (PDS), and Optimal Completion Strategy (OCS).
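As a rough illustration of how the first two strategies differ, the sketch below follows the usual formulation of WDS and PDS for ordinary (non-kernel) fuzzy c-means. It is an assumption that the paper's kernel-based variant uses the same idea, and the `MISSING` marker and function names are hypothetical.

```python
MISSING = None  # hypothetical marker for a missing feature value


def whole_data_subset(rows):
    """WDS: keep only the rows that have no missing feature."""
    return [row for row in rows if all(v is not MISSING for v in row)]


def partial_sq_distance(x, center):
    """PDS: squared Euclidean distance over the observed features only,
    rescaled by (total features / observed features)."""
    observed = [(xi, ci) for xi, ci in zip(x, center) if xi is not MISSING]
    if not observed:
        raise ValueError("row has no observed features")
    partial = sum((xi - ci) ** 2 for xi, ci in observed)
    return partial * len(x) / len(observed)


rows = [[1.0, 2.0], [MISSING, 3.0], [4.0, 5.0]]
complete_rows = whole_data_subset(rows)          # drops the incomplete row
d_incomplete = partial_sq_distance(rows[1], [0.0, 1.0])
```

WDS discards incomplete records before clustering, while PDS keeps them and compensates for the features it cannot compare.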

Main Methods:

The study used a Kernel Fuzzy C-means algorithm incorporating local adaptive distances.
Three distinct strategies (WDS, PDS, OCS) were implemented to address missing values within the clustering framework.
Performance was evaluated using various clustering metrics.
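To make the kernel fuzzy c-means idea concrete, here is a minimal sketch of the membership update for a plain Gaussian kernel. It is not the paper's VKFCM-K-LP algorithm: the local adaptive distances are left out, and the kernel width `sigma2`, the fuzzifier `m`, and all function names are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, v, sigma2=1.0):
    """Gaussian kernel K(x, v) with an assumed width parameter sigma2."""
    sq = sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    return math.exp(-sq / (2.0 * sigma2))


def kernel_sq_distance(x, v, sigma2=1.0):
    """Squared feature-space distance, 2 * (1 - K(x, v)) for a Gaussian kernel."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma2))


def memberships(x, centers, m=2.0, sigma2=1.0):
    """Fuzzy membership of point x in each cluster (values sum to 1)."""
    dists = [kernel_sq_distance(x, v, sigma2) for v in centers]
    # A point that coincides with a center belongs fully to that center.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in dists]
    total = sum(inv)
    return [w / total for w in inv]


centers = [[0.0, 0.0], [2.0, 0.0]]
u_near_first = memberships([0.5, 0.0], centers)
```

Each point receives a degree of membership in every cluster instead of a single hard label, and points closer to a center in the kernel-induced space receive a larger membership in it.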

Main Results:

The Partial Distance Strategy (PDS) and Optimal Completion Strategy (OCS) significantly outperformed the Whole Data Strategy (WDS).
The Optimal Completion Strategy (OCS) dynamically estimated missing values during optimization, yielding superior clustering results.
OCS-based clustering surpassed results obtained from imputing missing values with mean or median.
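The contrast between OCS and fixed imputation can be sketched as follows. The update shown is the one commonly used for OCS in ordinary fuzzy c-means, where a missing value is replaced by a membership-weighted average of the cluster centers. Whether the paper's kernel version uses exactly this weighting is an assumption, and the function names are hypothetical.

```python
def ocs_estimate(memberships_k, centers, feature, m=2.0):
    """OCS-style estimate of one missing feature: average of the cluster
    centers on that feature, weighted by membership ** m."""
    weights = [u ** m for u in memberships_k]
    total = sum(weights)
    return sum(w * c[feature] for w, c in zip(weights, centers)) / total


def mean_impute(rows, feature):
    """Static baseline: mean of the observed values of one feature."""
    observed = [r[feature] for r in rows if r[feature] is not None]
    return sum(observed) / len(observed)


centers = [[0.0, 0.0], [10.0, 10.0]]
rows = [[1.0, None], [2.0, 4.0], [3.0, 6.0]]

# A point that belongs mostly to the first cluster is filled in near that
# cluster's center, while mean imputation ignores cluster structure.
filled_by_ocs = ocs_estimate([0.9, 0.1], centers, feature=1)
filled_by_mean = mean_impute(rows, feature=1)
```

In an iterative clustering loop the OCS estimate would be recomputed each time the memberships and centers change, whereas the mean or median is computed once before clustering starts.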

Conclusions:

The proposed Kernel Fuzzy C-means algorithm with the Optimal Completion Strategy is effective for clustering incomplete datasets.
Dynamic estimation of missing values within the objective function offers a robust approach to handling data gaps.
OCS provides a significant improvement over traditional imputation methods for clustering tasks.