Hit screening with multivariate robust outlier detection.

Hui Sun Leong¹, Tianhui Zhang², Adam Corrigan¹

¹Data Sciences and Quantitative Biology, Discovery Sciences, Biopharmaceuticals R&D, AstraZeneca, Cambridge, United Kingdom.

September 12, 2024

Summary

This summary is machine-generated.

Hit screening identifies drug candidates, and multivariate assays turn it into a high-dimensional problem. mROUT (multivariate robust outlier detection) is a new method that identifies hits as outliers in high-dimensional data, which can improve the efficiency of drug discovery.

Area of Science:

Drug discovery and development
Bioinformatics and computational biology
High-content screening analysis

Background:

Hit screening is crucial for identifying compounds that modulate disease processes.
High-content screening assays generate complex, multivariate data requiring advanced analytical methods.
Conventional univariate approaches are insufficient for analyzing rich, high-dimensional screening data.

Purpose of the Study:

To develop an advanced method for hit identification in multivariate assays.
To address the challenge of analyzing complex, high-dimensional data from phenotypic screening.
To improve the accuracy and reliability of hit detection in drug discovery.

Main Methods:

Developed a novel method, mROUT (multivariate robust outlier detection).
mROUT utilizes principal components and robust Mahalanobis distance for outlier detection.
The method is designed for identifying multivariate hits in high-dimensional datasets.
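
The combination of principal components and a robust distance can be sketched roughly as follows. This is a simplified illustration, not the published mROUT procedure: the function names, the median/MAD scaling, the choice of two components, and the chi-square cutoff are all assumptions made for the example. A full implementation would more likely use a robust covariance estimator; here the principal component scores are simply treated as uncorrelated, so the squared distance reduces to a sum of squared robust z-scores.

```python
import math
import numpy as np

MAD_TO_SD = 1.4826  # scales the MAD so it estimates the SD under normality


def robust_scale(values):
    """Centre each column on its median and divide by its scaled MAD."""
    med = np.median(values, axis=0)
    mad = MAD_TO_SD * np.median(np.abs(values - med), axis=0)
    return (values - med) / mad


def flag_hits(features, alpha=0.01):
    """Flag rows with a large robust distance in the first two PCs.

    Hypothetical helper for illustration only; not the mROUT algorithm.
    """
    z = robust_scale(np.asarray(features, dtype=float))
    # Principal axes from the SVD of the robustly scaled data.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    scores = z @ vt[:2].T
    # Assumption: scores are uncorrelated, so the squared distance is the
    # sum of squared robust z-scores across the two components.
    d2 = np.sum(robust_scale(scores) ** 2, axis=1)
    # Upper-tail chi-square quantile with 2 degrees of freedom has the
    # closed form -2 * ln(alpha), so no statistics library is needed.
    cutoff = -2.0 * math.log(alpha)
    return d2 > cutoff, d2


# Simulated plate: 200 inactive wells plus 5 wells shifted in every feature.
rng = np.random.default_rng(0)
inactive = rng.normal(size=(200, 10))
hits = rng.normal(size=(5, 10)) + 6.0
is_hit, d2 = flag_hits(np.vstack([inactive, hits]))
print("flagged hits:", int(is_hit[200:].sum()), "of 5")
print("flagged inactive wells:", int(is_hit[:200].sum()), "of 200")
```

Median and MAD are used in place of mean and standard deviation so that the hits themselves do not inflate the scale against which they are measured, which is the general motivation for a robust distance.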

Main Results:

mROUT demonstrated superior performance in simulation studies compared to existing techniques.
The method effectively maintained Type I error, false discovery rate, and true discovery rate.
mROUT's efficacy was validated on an in-house CRISPR knockout phenotypic screening dataset.
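
For readers unfamiliar with the three metrics named above, the sketch below shows one common way to compute them in a simulation where the true hits are known. The function name and the toy labels are hypothetical, and the definitions are assumptions: in particular, the true discovery rate is taken here as the fraction of true hits that are flagged, which may differ from the definition used in the study.

```python
def screening_metrics(flagged, truth):
    """Compute error rates from per-well flags and known ground truth."""
    tp = sum(f and t for f, t in zip(flagged, truth))      # true hits flagged
    fp = sum(f and not t for f, t in zip(flagged, truth))  # inactives flagged
    n_true = sum(truth)
    n_null = len(truth) - n_true
    n_flagged = tp + fp
    return {
        # Share of inactive wells that were wrongly flagged.
        "type_i_error": fp / n_null if n_null else 0.0,
        # Share of flagged wells that are not true hits.
        "fdr": fp / n_flagged if n_flagged else 0.0,
        # Share of true hits that were flagged (assumed definition).
        "tdr": tp / n_true if n_true else 0.0,
    }


# Toy example: 4 true hits and 6 inactive wells.
truth = [True, True, True, True, False, False, False, False, False, False]
flagged = [True, True, True, False, True, False, False, False, False, False]
metrics = screening_metrics(flagged, truth)
print(metrics)
```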

Conclusions:

mROUT provides a robust and accurate approach for hit identification in multivariate assays.
The method enhances the analysis of complex high-content screening data, aiding drug discovery.
mROUT represents a significant advancement in computational methods for phenotypic screening.