Contemporary QSAR classifiers compared.

Craig L Bruce¹, James L Melville, Stephen D Pickett

  • ¹School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK.

Journal of Chemical Information and Modeling | January 24, 2007
Summary
This summary is machine-generated.

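Ensemble machine learning methods such as random forests improve predictive performance in drug data mining relative to single decision trees. The study identifies random forest parameters that performed well across all tested datasets and presents random forests as a more interpretable alternative to support vector machines.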

Area of Science:

  • Computational chemistry
  • Bioinformatics
  • Machine learning in drug discovery

Background:

  • Machine learning (ML) is widely used to analyze complex drug data.
  • Ensemble methods such as boosting, bagging, and random forests combine many decision trees into a single predictor.
  • Support vector machines (SVMs) are also widely used in cheminformatics.
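
To make the idea of a tree ensemble concrete, here is a minimal sketch of bagging in plain Python. It is not the study's implementation: the one-split "stump" learner, the function names `fit_stump` and `bagging`, and the toy data are all assumptions chosen for illustration. Each stump is trained on a bootstrap resample of the data, and the ensemble predicts by majority vote. A random forest additionally restricts each split to a random subset of descriptors, which this sketch omits.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (descriptor vector, class label)
Model = Callable[[Sequence[float]], int]


def fit_stump(data: List[Sample]) -> Model:
    """Fit a one-split decision tree (hypothetical helper) by exhaustive search."""
    majority = Counter(y for _, y in data).most_common(1)[0][0]
    best = None  # (errors, feature index, threshold, left label, right label)
    for f in range(len(data[0][0])):
        for threshold in sorted({x[f] for x, _ in data}):
            left = [y for x, y in data if x[f] <= threshold]
            right = [y for x, y in data if x[f] > threshold]
            if not left or not right:
                continue  # this threshold does not split the data
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        return lambda x: majority
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def bagging(data: List[Sample], n_trees: int = 25, seed: int = 0) -> Model:
    """Train n_trees stumps on bootstrap resamples and combine them by vote."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        boot = [rng.choice(data) for _ in data]  # sample with replacement
        stumps.append(fit_stump(boot))

    def predict(x: Sequence[float]) -> int:
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict


if __name__ == "__main__":
    # Invented one-descriptor toy set: low values inactive (0), high values active (1).
    toy = [([0.1], 0), ([0.2], 0), ([0.3], 0), ([0.7], 1), ([0.8], 1), ([0.9], 1)]
    model = bagging(toy)
    print(model([0.0]), model([1.0]))
```

Averaging over resamples is what makes the ensemble more stable than any single tree, which is the property the comparison against single decision trees examines.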

Purpose of the Study:

  • To comparatively assess state-of-the-art ML tools for drug data mining.
  • To investigate the performance of ensemble decision tree methods against single trees and SVMs.
  • To identify optimal parameters for random forests for improved predictive accuracy.

Main Methods:

  • Comparative analysis of support vector machines (SVMs), boosting, bagging, and random forest algorithms.
  • Utilized eight diverse datasets and two distinct sets of molecular descriptors.

  • Rigorous statistical tests, including multiple comparison tests, were employed to validate performance differences.

Main Results:

  • Ensemble decision tree methods demonstrated consistent predictive performance improvements over single decision trees.
  • No single algorithm consistently outperformed others across all datasets.
  • Identified a specific set of parameters for random forests that yielded optimal performance across all tested datasets.
  • Random forests offered a more interpretable model structure compared to SVMs.

Conclusions:

  • Ensemble ML methods, particularly random forests, offer significant advantages in drug data mining.
  • Optimized random forest models provide a powerful and interpretable tool for predictive tasks in drug discovery.
  • The interpretability of random forests presents a key advantage over black-box models like SVMs.
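
The Main Methods mention multiple comparison tests without naming them. One common choice when several classifiers are scored on several datasets is the Friedman rank test, sketched below in plain Python as an assumption about how such a comparison could be run, not as the study's procedure. The function names are hypothetical, the accuracy values in the usage example are invented for illustration and are not results from the paper, and the statistic omits the usual tie correction.

```python
from typing import Dict, List, Tuple


def average_ranks(scores: List[float]) -> List[float]:
    """Rank scores so the highest gets rank 1; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table: Dict[str, List[float]]) -> Tuple[Dict[str, float], float]:
    """Return (mean rank per method, Friedman chi-square statistic).

    `table` maps a method name to its scores, one score per dataset,
    with datasets in the same order for every method.
    """
    methods = list(table)
    k = len(methods)                # number of methods
    n = len(table[methods[0]])      # number of datasets
    rank_sums = [0.0] * k
    for d in range(n):
        row = [table[m][d] for m in methods]
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    mean_ranks = {m: rank_sums[j] / n for j, m in enumerate(methods)}
    # chi2 = 12n / (k(k+1)) * sum_j (R_j - (k+1)/2)^2, with R_j the mean ranks
    chi2 = 12 * n / (k * (k + 1)) * sum(
        (r - (k + 1) / 2) ** 2 for r in mean_ranks.values()
    )
    return mean_ranks, chi2


if __name__ == "__main__":
    # Invented accuracies on four hypothetical datasets, for illustration only.
    accuracies = {
        "single tree": [0.71, 0.65, 0.80, 0.74],
        "bagging": [0.78, 0.70, 0.84, 0.79],
        "random forest": [0.80, 0.69, 0.85, 0.81],
        "SVM": [0.79, 0.72, 0.83, 0.80],
    }
    ranks, chi2 = friedman_statistic(accuracies)
    print(ranks, chi2)
```

A large statistic indicates that the methods' ranks differ more than chance would suggest. It is compared against a chi-square distribution with k - 1 degrees of freedom, and a post hoc pairwise test is then needed to say which methods differ.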