QSAR with few compounds and many features.

D M Hawkins¹, S C Basak, X Shi

  • ¹School of Statistics, 313 Ford Hall, 224 Church Street S. E., University of Minnesota, Minneapolis, Minnesota 55455, USA. sbasak@nrri.umn.edu

Journal of Chemical Information and Computer Sciences | June 21, 2001
Summary
This summary is machine-generated.

This study introduces an enhanced ridge regression (RR) method for quantitative structure-activity relationship (QSAR) modeling, particularly for datasets with many features and few compounds. The approach uses generalized cross-validation for improved predictions and diagnostics in QSAR analysis.

Area of Science:

  • Cheminformatics
  • Computational Chemistry
  • Statistical Modeling

Background:

  • Quantitative Structure-Activity Relationship (QSAR) modeling relies on statistical methods that are sensitive to the characteristics of the data matrix, such as the ratio of compounds to descriptors.
  • Traditional QSAR struggles with datasets that have many descriptors and few compounds, where feature selection is often unrealistic.
  • Methods such as Principal Component Regression (PCR) and Partial Least Squares (PLS) offer alternatives that avoid feature selection, but they typically assume linearity.

Purpose of the Study:

  • To develop and present an advanced ridge regression (RR) methodology tailored for underdetermined QSAR modeling scenarios (many features, few compounds).
  • To integrate generalized cross-validation (GCV) for optimal ridge constant selection and F-tests for assessing additional information.
  • To enable the use of conventional regression diagnostics for identifying nonlinearities and model deviations.
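
For orientation, the quantities named in these aims can be written in their standard textbook form. This is a sketch under assumptions, not the paper's own notation: X is taken to be the n × p descriptor matrix with centered columns (n compounds, p descriptors, p > n), y the centered activity vector, k > 0 the ridge constant, and intercept handling is omitted. The authors' enhanced method may define these quantities differently.

```latex
% Ridge estimator: adding kI makes X^T X + kI invertible even when p > n
\hat{\beta}(k) = \left(X^{\top}X + kI\right)^{-1} X^{\top} y

% Hat matrix mapping the observed activities to the fitted activities
H(k) = X \left(X^{\top}X + kI\right)^{-1} X^{\top}

% Generalized cross-validation criterion; the ridge constant is chosen to minimize it
\mathrm{GCV}(k) = \frac{\tfrac{1}{n}\,\bigl\lVert \left(I - H(k)\right) y \bigr\rVert^{2}}
                       {\bigl[\,1 - \operatorname{tr} H(k)/n\,\bigr]^{2}}
```

When p > n the matrix XᵀX is singular, so the ordinary least-squares estimate is not unique; any k > 0 restores a unique solution, and tr H(k) acts as the effective number of fitted parameters.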

Main Methods:

  • Development of a modified ridge regression (RR) approach for underdetermined data matrices.
  • Application of generalized cross-validation (GCV) to determine the optimal ridge constant.
  • Utilization of F-tests to evaluate the significance of additional predictive information.
  • Employing standard regression diagnostics for post-model analysis.
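
The first two steps above can be illustrated with a minimal sketch of ordinary ridge regression whose ridge constant is picked by GCV over a grid. It is not the authors' modified method. The function name `ridge_gcv`, the grid `ks`, the choice to standardize descriptors, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def ridge_gcv(X, y, ks):
    """Fit ridge regression, choosing the ridge constant from `ks` by GCV.

    Hypothetical helper: X is (n compounds, p descriptors), y is (n,),
    and ks is a 1-D array of strictly positive candidate ridge constants.
    """
    n = X.shape[0]
    # Standardize descriptors and center y so the intercept is not penalized
    x_mean = X.mean(axis=0)
    x_std = X.std(axis=0)
    x_std[x_std == 0] = 1.0          # guard against constant descriptors
    Xs = (X - x_mean) / x_std
    y_mean = y.mean()
    yc = y - y_mean

    # A thin SVD makes the fit cheap for every k, even when p >> n
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    Uty = U.T @ yc

    scores = []
    for k in ks:
        shrink = s**2 / (s**2 + k)            # shrinkage per singular direction
        resid = yc - U @ (shrink * Uty)       # residuals of the ridge fit
        edf = shrink.sum() + 1.0              # trace of hat matrix (+1 for intercept)
        scores.append(n * (resid @ resid) / (n - edf) ** 2)
    scores = np.array(scores)

    best_k = ks[int(np.argmin(scores))]
    beta_std = Vt.T @ (s / (s**2 + best_k) * Uty)
    beta = beta_std / x_std                   # back to the original descriptor scale
    intercept = y_mean - x_mean @ beta
    return best_k, beta, intercept, scores

# Synthetic underdetermined example: 20 "compounds", 50 "descriptors"
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=20)
ks = np.logspace(-2, 3, 30)
best_k, beta, intercept, scores = ridge_gcv(X, y, ks)
predictions = intercept + X @ beta
```

Working through the SVD avoids forming and inverting a p × p matrix for each candidate constant, which matters when descriptors far outnumber compounds.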

Main Results:

  • The enhanced RR method provides a viable approach for QSAR modeling when the number of features is high relative to the number of compounds.
  • Generalized cross-validation effectively selects the ridge constant, leading to robust model performance.
  • The methodology allows nonlinearities and other violations of model assumptions to be detected through follow-up diagnostics.
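
As a rough illustration of the follow-up diagnostics mentioned above, the sketch below computes leverages and standardized residuals for a plain ridge fit with a fixed ridge constant. The summary does not say which diagnostics the authors use, so the function name `ridge_diagnostics`, the residual variance estimate based on effective degrees of freedom, and the synthetic data are assumptions.

```python
import numpy as np

def ridge_diagnostics(X, y, k):
    """Leverages and standardized residuals for a ridge fit with constant k > 0.

    Hypothetical helper for illustration; X is (n, p), y is (n,).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    shrink = s**2 / (s**2 + k)

    # Hat matrix of the centered ridge fit plus the intercept's contribution 1/n
    H = (U * shrink) @ U.T + 1.0 / n
    fitted = y.mean() + (U * shrink) @ (U.T @ yc)
    resid = y - fitted

    leverage = np.diag(H)
    edf = leverage.sum()                       # effective number of parameters
    sigma2 = resid @ resid / (n - edf)         # residual variance estimate

    # Var(resid_i) = sigma^2 * [(I - H)^2]_ii, ignoring the ridge bias
    resid_scale = np.sum((np.eye(n) - H) ** 2, axis=1)
    std_resid = resid / np.sqrt(sigma2 * resid_scale)
    return fitted, resid, leverage, std_resid

# Synthetic example: 15 "compounds", 40 "descriptors"
rng = np.random.default_rng(1)
X = rng.normal(size=(15, 40))
y = X[:, 0] + rng.normal(scale=0.2, size=15)
fitted, resid, leverage, std_resid = ridge_diagnostics(X, y, k=5.0)
```

Plotting `std_resid` against `fitted` or against individual descriptors is the usual way to look for curvature, and compounds with unusually high `leverage` are candidates for closer inspection.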

Conclusions:

  • The presented ridge regression development offers a powerful statistical tool for QSAR analysis in challenging data scenarios.
  • This method enhances predictive accuracy and interpretability for complex molecular datasets.
  • The approach facilitates a more comprehensive understanding of structure-activity relationships through advanced statistical modeling.