Feature selection in gene expression data using principal component analysis and rough set theory.

Debahuti Mishra¹, Rajashree Dash, Amiya Kumar Rath

¹Department of Computer Science & Engineering, Institute of Technical Education & Research, Siksha O Anusandhan University, Bhubaneswar, Orissa, India. debahuti@iter.ac.in

Advances in Experimental Medicine and Biology

March 25, 2011

Summary

This summary is machine-generated.

This study introduces Rough PCA, a feature selection method that combines Principal Component Analysis (PCA) with Rough Set Theory. PCA first reduces high-dimensional data to a set of principal components, and Rough Set Theory then selects the components most useful for classification. The method was validated on gene expression data.

Area of Science:

Data Mining
Machine Learning
Pattern Recognition
Signal Processing

Background:

High-dimensional data with numerous features is common in data mining and machine learning.
Feature selection is crucial for preprocessing such data to improve classification tasks.
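Traditional dimensionality reduction falls into two groups: Feature Extraction (FE), which builds new features from combinations of the original ones, and Feature Selection (FS), which keeps a subset of the original features.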

Purpose of the Study:

To develop an effective feature selection method for high-dimensional data.
To combine Principal Component Analysis (PCA) with Rough Set Theory for improved feature selection.
To enhance classification accuracy by selecting the most relevant principal components.

Main Methods:

Principal Component Analysis (PCA): An unsupervised linear FE method for dimensionality reduction by identifying directions of maximal variance.
Rough Set Theory: A method for discovering data dependencies and reducing the number of attributes using only the data itself, without additional prior information.
Rough PCA: A hybrid approach combining PCA for feature extraction and Rough Set Theory for feature selection from principal components.
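
As an illustration of the PCA step described above, the following is a minimal sketch, not the authors' implementation. It assumes NumPy is available and computes the principal directions from the singular value decomposition of the centered data. The function name `pca` and the random toy matrix are hypothetical stand-ins for a real expression data set.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top n_components directions of maximal variance."""
    # Center each feature so that variance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Scores are the coordinates of each sample in the reduced space.
    scores = X_centered @ components.T
    # Sample variance captured by each retained direction.
    explained_variance = (S[:n_components] ** 2) / (X.shape[0] - 1)
    return scores, components, explained_variance

# Toy data (assumption): 20 samples, 50 features, many more features than samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
scores, components, explained_variance = pca(X, n_components=5)
print(scores.shape)
```

Each column of `scores` is one principal component evaluated on every sample. These columns are the candidate features that a later selection step would choose among.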

Main Results:

The proposed Rough PCA method successfully identifies principal features from high-dimensional datasets.
Upper and Lower Approximations from Rough Set Theory were applied to find a reduced set of features.
The method was validated on gene expression data, demonstrating its effectiveness.
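
The lower and upper approximations mentioned above can be sketched on a small decision table. This is a generic textbook-style illustration of Rough Set Theory, not code from the study. The toy rows, the labels, and the function names `equivalence_classes`, `approximations`, and `dependency_degree` are all hypothetical.

```python
from collections import defaultdict

def equivalence_classes(rows, attrs):
    """Group object indices that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())

def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of a target set of object indices."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attrs):
        if block <= target:   # block lies entirely inside the target: certain members
            lower |= block
        if block & target:    # block overlaps the target: possible members
            upper |= block
    return lower, upper

def dependency_degree(rows, attrs, labels):
    """Fraction of objects whose label is fully determined by attrs (positive region)."""
    positive = set()
    for label in set(labels):
        target = {i for i, current in enumerate(labels) if current == label}
        lower, _ = approximations(rows, attrs, target)
        positive |= lower
    return len(positive) / len(rows)

# Toy decision table (assumption): two discretized attributes per object.
rows = [(0, 1), (0, 1), (1, 0), (1, 1), (0, 0)]
labels = ["tumor", "normal", "tumor", "normal", "normal"]

tumor = {i for i, current in enumerate(labels) if current == "tumor"}
lower, upper = approximations(rows, [0, 1], tumor)
gamma = dependency_degree(rows, [0, 1], labels)
print(lower, upper, gamma)
```

Objects 0 and 1 share the same attribute values but carry different labels, so neither can be placed in a lower approximation. This is the kind of inconsistency that the dependency degree penalizes when comparing attribute subsets.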

Conclusions:

Rough PCA offers a robust approach for feature selection in high-dimensional data.
The joint application of PCA and Rough Set Theory is intended to retain only those principal components that discriminate well between classes.
The method shows promise for applications in bioinformatics, particularly with gene expression data analysis.
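
One way the combination of PCA and rough-set selection could be wired together is sketched below. This is an assumption-heavy illustration and not the procedure from the study. The median-based discretization, the greedy QuickReduct-style search, the synthetic two-class data, and every function name are hypothetical choices made only to give a runnable example.

```python
import numpy as np

def pca_scores(X, n_components):
    """Feature extraction: coordinates of each sample on the top principal directions."""
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

def discretize(scores):
    """Assumption: binarize each component at its median so rough sets can be applied."""
    return (scores > np.median(scores, axis=0)).astype(int)

def dependency_degree(table, cols, labels):
    """Fraction of samples whose label is determined by the chosen columns."""
    keys = [tuple(row) for row in table[:, cols]]
    seen = {}
    for key, label in zip(keys, labels):
        seen.setdefault(key, set()).add(label)
    consistent = sum(1 for key in keys if len(seen[key]) == 1)
    return consistent / len(labels)

def greedy_reduct(table, labels):
    """Add one component at a time until the full-table dependency is reached."""
    selected, remaining = [], list(range(table.shape[1]))
    best = dependency_degree(table, selected, labels)
    full = dependency_degree(table, remaining, labels)
    while best < full and remaining:
        # Pick the column with the highest dependency; ties go to the lowest index.
        score, col = max(
            ((dependency_degree(table, selected + [c], labels), c) for c in remaining),
            key=lambda pair: (pair[0], -pair[1]),
        )
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best

# Synthetic data (assumption): 20 samples, 30 features, two classes that
# differ by a large shift in the first five features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 30))
labels = [0] * 10 + [1] * 10
X[10:, :5] += 5.0

table = discretize(pca_scores(X, n_components=5))
selected, gamma = greedy_reduct(table, labels)
print(selected, gamma)
```

The greedy search does not guarantee a minimal subset, and a different discretization could change which components are kept, so the output should be read as an illustration of the workflow only.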