Updated: May 29, 2025


Random k conditional nearest neighbor for high-dimensional data.

Jiaxuan Lu¹, Hyukjun Gweon¹

¹University of Western Ontario, London, ON, Canada.

PeerJ Computer Science

February 3, 2025

Summary

This summary is machine-generated.

This study enhances the k-conditional nearest neighbor (kCNN) algorithm for better classification performance, especially in high-dimensional datasets with noisy features. The new method aggregates multiple kCNN classifiers for improved predictive accuracy.

Keywords:

High-dimensional data, k-nearest neighbor, nonparametric classification

Area of Science:

Machine Learning
Bioinformatics
Computational Biology

Background:

The k-nearest neighbor (kNN) algorithm is widely used for classification but struggles with high-dimensional and noisy data.
Existing kNN variants may not effectively handle non-informative features or the curse of dimensionality.
The k-conditional nearest neighbor (kCNN) method offers improvements but can be further optimized.
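
The effect of a noisy feature on kNN can be illustrated with a small sketch. The data below are invented for illustration only and are not taken from the study; `knn_predict` is a hypothetical helper name. With a single informative feature the query is assigned to class 0, but once a high-variance, non-informative feature is appended, the Euclidean distances are dominated by noise and the vote flips to class 1.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data (assumption): one informative feature separates the classes.
labels = [0, 0, 0, 1, 1, 1]
clean_X = [[0.0], [0.2], [0.4], [1.0], [1.2], [1.4]]

# Same points with one non-informative feature appended. Its values are
# large for class 0 purely by chance, so they swamp the informative feature.
noise = [5.0, -4.0, 6.0, 0.1, -0.2, 0.3]
noisy_X = [row + [n] for row, n in zip(clean_X, noise)]

print(knn_predict(clean_X, labels, [0.3], k=3))       # → 0
print(knn_predict(noisy_X, labels, [0.3, 0.0], k=3))  # → 1
```

The example is deliberately extreme; in real high-dimensional data the same dilution arises from many weakly noisy features rather than one large one.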

Purpose of the Study:

To address the limitations of kNN and kCNN in high-dimensional and noisy datasets.
To propose an enhanced kCNN approach by aggregating multiple classifiers built on feature subsets.
To introduce a scoring metric for weighting individual classifiers based on feature subset separation.

Main Methods:

Extension of the k-conditional nearest neighbor (kCNN) algorithm.
Aggregation of multiple kCNN classifiers, each trained on a randomly sampled feature subset.
Development of a score metric to weigh the contribution of each classifier.
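
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not give the kCNN formula or the score metric, so `kcnn_scores` uses a simplified stand-in (normalized inverse distance to the k-th nearest neighbor within each class) and `separation_weight` is a hypothetical metric (between-class centroid distance divided by within-class spread). All function names and the toy data are invented for the example.

```python
import math
import random


def project(row, features):
    """Keep only the selected feature indices of one observation."""
    return [row[j] for j in features]


def kcnn_scores(train_X, train_y, query, k):
    """Simplified kCNN-style class scores (assumption, not the paper's formula):
    inverse distance to the k-th nearest neighbor within each class, normalized."""
    inv = {}
    for label in sorted(set(train_y)):
        dists = sorted(math.dist(x, query) for x, y in zip(train_X, train_y) if y == label)
        kth = dists[min(k, len(dists)) - 1]
        inv[label] = 1.0 / (kth + 1e-12)
    total = sum(inv.values())
    return {label: v / total for label, v in inv.items()}


def separation_weight(train_X, train_y):
    """Hypothetical separation score: mean distance between class centroids
    divided by mean distance of points to their own centroid.
    Assumes at least two classes are present."""
    centroids, spreads = {}, []
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        center = [sum(col) / len(rows) for col in zip(*rows)]
        centroids[label] = center
        spreads.extend(math.dist(x, center) for x in rows)
    names = list(centroids)
    between = [math.dist(centroids[a], centroids[b])
               for i, a in enumerate(names) for b in names[i + 1:]]
    within = sum(spreads) / len(spreads)
    return (sum(between) / len(between)) / (within + 1e-12)


def random_subsets(n_features, subset_size, n_models, seed=0):
    """Draw one random feature subset per ensemble member."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size)) for _ in range(n_models)]


def ensemble_predict(train_X, train_y, query, k, subsets):
    """Weighted sum of per-subset class scores; returns the top-scoring label."""
    totals = {}
    for features in subsets:
        sub_X = [project(x, features) for x in train_X]
        weight = separation_weight(sub_X, train_y)
        scores = kcnn_scores(sub_X, train_y, project(query, features), k)
        for label, s in scores.items():
            totals[label] = totals.get(label, 0.0) + weight * s
    return max(totals, key=totals.get)


# Toy data (assumption): feature 0 is informative, features 1 and 2 are noise.
train_X = [[0.0, 5.0, -3.0], [0.2, -4.0, 4.0], [0.4, 6.0, -5.0],
           [1.0, 0.1, 0.2], [1.2, -0.2, -0.1], [1.4, 0.3, 0.1]]
train_y = [0, 0, 0, 1, 1, 1]
query = [0.3, 0.0, 0.0]

print(ensemble_predict(train_X, train_y, query, k=2, subsets=[[1]]))       # → 1
print(ensemble_predict(train_X, train_y, query, k=2, subsets=[[0], [1]]))  # → 0
print(ensemble_predict(train_X, train_y, query, k=2,
                       subsets=random_subsets(3, subset_size=2, n_models=5)))
```

In this toy case the classifier built on the noise feature alone votes for class 1, but its low separation weight lets the classifier built on the informative feature dominate the aggregate. Passing explicit subsets to `ensemble_predict` keeps the aggregation step deterministic and easy to inspect, while `random_subsets` supplies the random sampling.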

Main Results:

Simulation studies investigated the properties of the proposed method.
Experiments on gene expression datasets demonstrated promising predictive classification performance.
The proposed ensemble approach shows potential for handling high-dimensional data with noisy features.

Conclusions:

The proposed aggregated kCNN method effectively addresses kNN limitations in high-dimensional and noisy data.
The method shows promise for improving classification accuracy in complex biological datasets.
Further research can explore the application of this technique in other domains requiring robust classification.