Unsupervised forward selection: a method for eliminating redundant variables.

D C Whitley¹, M G Ford, D J Livingstone

  • ¹ Centre for Molecular Design, Institute of Biomedical and Biomolecular Science, University of Portsmouth, UK. david.whitley@port.ac.uk

Journal of Chemical Information and Computer Sciences | October 25, 2000
Summary
This summary is machine-generated.

This study introduces an unsupervised learning method for variable selection in QSAR modeling. The method selects descriptors that are relevant and non-redundant, with reduced multicollinearity, and the resulting models are simple, robust, and easy to interpret.

Area of Science:

  • Computational chemistry
  • Cheminformatics
  • Machine learning

Background:

  • Quantitative Structure-Activity Relationship (QSAR) studies often involve high-dimensional datasets with numerous descriptors.
  • Variable selection is crucial for building robust and interpretable QSAR models, reducing noise and potential overfitting.
  • Existing methods may struggle with redundancy and multicollinearity among selected descriptors.
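
To make the redundancy problem concrete, the sketch below flags descriptor pairs whose squared Pearson correlation exceeds a cutoff. It is only an illustration of pairwise redundancy, not a procedure taken from the paper: the function names, the `r2_threshold` value, and the toy descriptor values are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # A constant column has zero variance and would divide by zero here.
    return sxy / math.sqrt(sxx * syy)


def correlated_pairs(columns, r2_threshold=0.9):
    """List (name_a, name_b, r2) for pairs with squared correlation above the cutoff."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r2 = pearson(columns[a], columns[b]) ** 2
            if r2 > r2_threshold:
                flagged.append((a, b, r2))
    return flagged


# Hypothetical toy descriptors: "mass_kda" is "mass_da" in different units.
descriptors = {
    "mass_da": [180.0, 250.0, 310.0, 420.0, 505.0],
    "mass_kda": [0.180, 0.250, 0.310, 0.420, 0.505],
    "logp": [1.2, -0.4, 3.1, 0.8, 2.0],
}
flagged = correlated_pairs(descriptors)
```

A pairwise check like this catches duplicated or rescaled descriptors, but it cannot detect a descriptor that is a combination of several others, which is why multicollinearity needs a multiple-correlation measure.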

Purpose of the Study:

  • To propose and evaluate an unsupervised learning method for variable selection in QSAR.
  • To identify a subset of descriptors that are relevant, non-redundant, and minimally collinear.
  • To develop simple, robust, and easily interpretable QSAR models using the selected variables.

Main Methods:

  • An unsupervised learning algorithm was developed for descriptor selection.

  • Continuum regression, an algorithm that encompasses Ordinary Least Squares (OLS), Principal Component Regression (PCR), and Partial Least Squares Regression (PLS), was used to build models from the selected variables.
  • The method was tested on three standard QSAR datasets.
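
The selection step can be illustrated with a greedy, response-free procedure. The sketch below is a simplified illustration in the spirit of the paper's title, not the authors' published algorithm: it seeds the selection with the least-correlated pair of columns, then repeatedly adds the column with the smallest squared multiple correlation (R²) with the columns already chosen, stopping once that R² reaches a cutoff. The names `forward_select` and `r2_max`, the default cutoff, and the toy data are assumptions.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _unit_centered(col):
    """Mean-center a column and scale it to unit length."""
    mean = sum(col) / len(col)
    centered = [v - mean for v in col]
    norm = math.sqrt(_dot(centered, centered))
    return [v / norm for v in centered]


def forward_select(columns, r2_max=0.9):
    """Greedy selection without a response variable (illustrative sketch)."""
    names = list(columns)
    unit = {name: _unit_centered(columns[name]) for name in names}
    selected, basis = [], []  # basis: orthonormal vectors spanning the selection

    def add(name):
        # Gram-Schmidt step: keep only the part of the column not yet spanned.
        residual = unit[name][:]
        for q in basis:
            proj = _dot(residual, q)
            residual = [r - proj * qi for r, qi in zip(residual, q)]
        norm = math.sqrt(_dot(residual, residual))
        basis.append([r / norm for r in residual])
        selected.append(name)

    # Seed with the pair of columns having the smallest squared correlation.
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    first, second = min(pairs, key=lambda p: _dot(unit[p[0]], unit[p[1]]) ** 2)
    add(first)
    if _dot(unit[first], unit[second]) ** 2 < r2_max:
        add(second)

    remaining = [n for n in names if n not in selected]
    while remaining:
        # For a unit-length column, R^2 with the selected set is the squared
        # length of its projection onto the orthonormal basis.
        r2 = {n: sum(_dot(unit[n], q) ** 2 for q in basis) for n in remaining}
        best = min(remaining, key=r2.get)
        if r2[best] >= r2_max:
            break  # every remaining column is too well explained already
        add(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "d_sum" equals "d1" + "d2", so the three are dependent.
toy = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "d2": [3.0, -1.0, 2.0, -2.0, 1.0, -3.0],
    "d3": [1.0, 1.0, -2.0, -2.0, 1.0, 1.0],
    "d_sum": [4.0, 1.0, 5.0, 2.0, 6.0, 3.0],
}
selected = forward_select(toy)
```

On this toy data the procedure keeps three of the four columns: `d3` plus two of the three mutually dependent columns, since the third is then fully explained by the other two. Because the response is never consulted, the same subset can be passed to any downstream regression method.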

Main Results:

  • The unsupervised method successfully identified relevant descriptors.
  • Redundancy and multicollinearity were significantly reduced in the selected variable subsets.
  • QSAR models built with the selected variables demonstrated simplicity, robustness, and interpretability.

Conclusions:

  • The proposed unsupervised variable selection method is effective for QSAR data analysis.
  • This approach enhances model interpretability and robustness by addressing descriptor relevance and intercorrelation.
  • The integration with continuum regression provides a powerful framework for QSAR model development.