Ensemble feature selection: consistent descriptor subsets for multiple QSAR models.

Debojyoti Dutta¹, Rajarshi Guha, David Wild

  • ¹School of Informatics, Indiana University, Bloomington, Indiana 47406, USA.

Journal of Chemical Information and Modeling | April 5, 2007
Summary
This summary is machine-generated.

Selecting a single descriptor subset for multiple quantitative structure-activity relationship (QSAR) models improves interpretability without sacrificing predictive performance. This approach aids in understanding structure-activity relationships across different model types.

Area of Science:

  • Cheminformatics
  • Computational Chemistry
  • Machine Learning

Background:

  • Descriptor selection is crucial for building predictive quantitative structure-activity relationship (QSAR) models.
  • Traditional methods often yield different descriptor subsets for each model type, hindering interpretability.
  • Ensemble modeling typically uses diverse descriptor sets, complicating comparative analysis.

Purpose of the Study:

  • To develop a method for selecting a single, optimal descriptor subset applicable to multiple QSAR model types.
  • To evaluate the impact of using a common descriptor set on model interpretability and predictive performance.
  • To demonstrate the utility of this approach across different datasets and modeling tasks.

Main Methods:

  • Proposed a novel approach for selecting a unified descriptor subset.
  • Applied the method to three distinct datasets, encompassing both regression and classification tasks.
  • Developed multiple QSAR models (e.g., linear regression, neural networks) using the selected common descriptor set.
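
The idea of choosing one descriptor subset for several model types can be illustrated with a small sketch. This is not the authors' algorithm, which the summary does not describe. It assumes a summed cross-validated RMSE as the shared objective, an exhaustive search over fixed-size subsets as the optimizer, and a linear model plus a k-nearest-neighbour model as the two model types. All function names and the synthetic data are hypothetical.

```python
import itertools
import math
import random

def fit_linear(X, y):
    """Least-squares fit with intercept via the normal equations (Gauss-Jordan)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented system (A^T A | A^T y); a tiny diagonal term guards against singularity.
    A = [[sum(r[i] * r[j] for r in rows) + (1e-8 if i == j else 0.0) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    w = [A[i][p] / A[i][i] for i in range(p)]
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def fit_knn(X, y, k=3):
    """k-nearest-neighbour regressor using squared Euclidean distance."""
    def predict(x):
        def d2(i):
            return sum((a - b) ** 2 for a, b in zip(X[i], x))
        nearest = sorted(range(len(X)), key=d2)[:k]
        return sum(y[i] for i in nearest) / len(nearest)
    return predict

def cv_rmse(fit, X, y, folds=5):
    """Cross-validated RMSE; fold f holds out the rows whose index i has i % folds == f."""
    n, sq = len(X), 0.0
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        model = fit([X[i] for i in train], [y[i] for i in train])
        sq += sum((model(X[i]) - y[i]) ** 2 for i in range(f, n, folds))
    return math.sqrt(sq / n)

def select_shared_subset(X, y, size, fitters):
    """Return the descriptor columns that minimize the summed CV error of all model types."""
    def score(cols):
        Xs = [[row[c] for c in cols] for row in X]
        return sum(cv_rmse(fit, Xs, y) for fit in fitters)
    return min(itertools.combinations(range(len(X[0])), size), key=score)

# Synthetic data (an assumption): 6 descriptors, activity depends on columns 1 and 4 only.
random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(60)]
y = [2.0 * r[1] - 1.5 * r[4] + random.gauss(0, 0.05) for r in X]

subset = select_shared_subset(X, y, size=2, fitters=[fit_linear, fit_knn])
print(subset)
```

Because both models are scored on the same candidate subset, the winning subset is a compromise that works for each of them. With many descriptors an exhaustive search becomes infeasible, and a stochastic or greedy search over the same objective would be the natural substitute.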

Main Results:

  • The unified descriptor subset approach did not significantly compromise the predictive ability of individual QSAR models.
  • Models developed using the same descriptor set demonstrated comparable performance to traditional methods.
  • Interpretation of models revealed consistent structure-activity trends across different model types.
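
One simple way to check whether two models built on the same descriptors agree on structure-activity trends is to compare the direction of each descriptor's effect. The sketch below is an illustration only, not the interpretation procedure used in the paper. It assumes a finite-difference sensitivity (nudge one descriptor, average the change in prediction), and the two "models" are hypothetical stand-in functions rather than fitted QSAR models.

```python
import math
import random

def mean_effect(predict, X, col, delta=0.1):
    """Average change in prediction when descriptor `col` is increased by `delta`."""
    total = 0.0
    for row in X:
        bumped = list(row)
        bumped[col] += delta
        total += predict(bumped) - predict(row)
    return total / len(X)

def trend_signs(models, X, delta=0.1):
    """For each model, the sign (+1, -1 or 0) of every descriptor's mean effect."""
    signs = {}
    for name, predict in models.items():
        effects = [mean_effect(predict, X, c, delta) for c in range(len(X[0]))]
        signs[name] = [(e > 0) - (e < 0) for e in effects]
    return signs

# Toy inputs (assumptions): two descriptors and two stand-in prediction functions.
random.seed(1)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(50)]
models = {
    "linear": lambda x: 2.0 * x[0] - 1.5 * x[1],
    "nonlinear": lambda x: math.tanh(3.0 * x[0]) - x[1] ** 3,
}

signs = trend_signs(models, X)
consistent = signs["linear"] == signs["nonlinear"]
print(signs, consistent)
```

This comparison is only meaningful when both models use the same descriptor set, which is the situation the shared-subset approach creates.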

Conclusions:

  • A single descriptor subset can be effectively used for multiple QSAR model types without substantial loss in predictive power.
  • This unified approach enhances the interpretability of QSAR models by revealing common structure-activity relationships.
  • The method offers a valuable strategy for developing more transparent and comparable QSAR models.