



A Multi-Objective Genetic Algorithm for Outlier Removal.

Oren E Nahum1,2,3, Abraham Yosipof4, Hanoch Senderowitz3

  • 1Department of Management, Bar-Ilan University, Ramat-Gan 52900, Israel.

Journal of Chemical Information and Modeling | November 11, 2015
Summary
This summary is machine-generated.

A new multi-objective genetic algorithm effectively identifies and removes outliers in quantitative structure-activity relationship (QSAR) modeling. This method improves QSAR model prediction statistics while preserving data diversity.


Area of Science:

  • Computational chemistry
  • Medicinal chemistry
  • Cheminformatics

Background:

  • Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models relate compound structures mathematically to their activities or properties.
  • Outliers, compounds that differ significantly from the rest of the dataset, can compromise the statistical accuracy and predictive power of QSAR/QSPR models.
  • Effective outlier removal is crucial for deriving reliable QSAR models with robust prediction statistics.
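To illustrate why a single outlier matters, the following minimal sketch fits a one-descriptor least-squares line and compares the coefficient of determination with and without one deviating compound. The data values and the `r_squared` helper are made up for illustration and are not taken from the study.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a one-descriptor least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical descriptor/activity pairs that follow a near-linear trend.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]

r2_clean = r_squared(descriptor, activity)
# Add one hypothetical compound whose activity breaks the trend.
r2_with_outlier = r_squared(descriptor + [7.0], activity + [1.0])

print(f"R^2 without outlier: {r2_clean:.3f}")
print(f"R^2 with outlier:    {r2_with_outlier:.3f}")
```

In this toy case one added point is enough to pull the fit statistic down sharply, which is the kind of degradation outlier removal aims to avoid.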

Purpose of the Study:

  • To introduce a novel multi-objective genetic algorithm for identifying and removing outliers in datasets.
  • To evaluate the algorithm's performance in improving QSAR model statistics and maintaining data diversity.
  • To offer an optional 'preservation' function for retaining specific compounds of interest.

Main Methods:

  • Development of a multi-objective genetic algorithm integrating the k-nearest neighbors (kNN) method for outlier detection.
  • Application of the algorithm to three pharmaceutical datasets: logBBB, factor 7 inhibitors, and dihydrofolate reductase inhibitors.
  • Comparative analysis of the proposed algorithm against five other outlier removal techniques.
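This summary does not give the algorithm's exact objective functions or genetic operators. As a rough sketch under that caveat, one plausible way to score a candidate solution (a mask of which compounds to keep) is to pair the number of removed compounds with a leave-one-out kNN prediction error on the kept ones. The function names, the choice of `k`, and the data below are assumptions for illustration only.

```python
import math


def knn_loo_error(points, activities, keep, k=2):
    """Mean squared leave-one-out kNN error over the kept compounds."""
    kept = [i for i, flag in enumerate(keep) if flag]
    if len(kept) < 2:
        raise ValueError("need at least two kept compounds")
    total = 0.0
    for i in kept:
        # Nearest kept neighbours of compound i in descriptor space.
        neighbours = sorted(
            (j for j in kept if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        prediction = sum(activities[j] for j in neighbours) / len(neighbours)
        total += (activities[i] - prediction) ** 2
    return total / len(kept)


def objectives(points, activities, keep, k=2):
    """Two quantities to minimise: compounds removed and kNN error."""
    return keep.count(False), knn_loo_error(points, activities, keep, k)


# Hypothetical descriptors: two tight clusters plus one compound (index 6)
# that sits inside the first cluster but has a very different activity.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
          (0.05, 0.05)]
activities = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0]

keep_all = [True] * 7
keep_filtered = [True] * 6 + [False]

removed_all, error_all = objectives(points, activities, keep_all)
removed_filtered, error_filtered = objectives(points, activities, keep_filtered)
print(removed_all, round(error_all, 3))
print(removed_filtered, round(error_filtered, 3))
```

A genetic algorithm would evolve many such masks and trade the two values off against each other instead of fixing a single removal threshold in advance.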

Main Results:

  • The new algorithm generated filtered datasets that better preserved the internal diversity of the original data.
  • QSAR models derived from the algorithm's filtered datasets exhibited significantly improved prediction statistics.
  • An added 'preservation' objective function allowed for selective removal of low-probability outliers, retaining compounds with favorable activities or unique scaffolds.
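How a 'preservation' objective can sit alongside the other objectives can be sketched with plain Pareto dominance: a candidate filtered dataset is kept only if no other candidate is at least as good on every objective and strictly better on one. The selection scheme actually used in the study is not described in this summary, so the objective tuple (compounds removed, model error, protected compounds removed), the helper names, and the numbers below are all hypothetical.

```python
def dominates(a, b):
    """True if tuple a is no worse than b everywhere and better somewhere.

    All objectives are treated as quantities to minimise.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Names of candidates that no other candidate dominates."""
    return [
        name
        for name, objs in candidates.items()
        if not any(
            dominates(other, objs)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


def preservation_penalty(keep, protected):
    """Count protected compounds (by index) that a keep-mask would remove."""
    return sum(1 for i in protected if not keep[i])


# Hypothetical candidates: (compounds removed, model error, protected removed).
candidates = {
    "A": (0, 9.0, 0),  # removes nothing, poor fit
    "B": (1, 0.5, 1),  # best fit, but drops a protected compound
    "C": (1, 0.6, 0),  # slightly worse fit, keeps protected compounds
    "D": (2, 0.6, 1),  # worse than C on two objectives, no better on the third
}

front = pareto_front(candidates)
print(front)
```

Here candidate D is discarded because C matches or beats it on every objective, while B and C both survive, leaving the choice between a better fit and retaining the compound of interest to the user.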

Conclusions:

  • The proposed multi-objective genetic algorithm is an effective tool for outlier identification and removal in QSAR/QSPR studies.
  • The algorithm enhances the predictive accuracy of QSAR models while preserving the internal diversity of the dataset.
  • The 'preservation' feature offers valuable flexibility for specific compound selection in drug discovery applications.