Graph neural networks are promising for phenotypic virtual screening on cancer cell lines
Summary
This summary is machine-generated. Deep learning models, particularly D-MPNN, show promise for phenotypic virtual screening (PVS) in cancer drug design. Rigorous testing on diverse chemical libraries confirms D-MPNN
Area Of Science
- Computational chemistry
- Drug discovery
- Artificial intelligence in medicine
Background
- Artificial intelligence (AI) is revolutionizing early drug design, with phenotypic virtual screening (PVS) offering new ways to predict cancer cell responses.
- Previous studies questioning deep learning's efficacy in PVS were limited by small datasets, inadequate metrics, and unrealistic chemical diversity.
- Real-world virtual screening libraries present significant chemical diversity, posing a challenge for predictive models.
Purpose Of The Study
- To rigorously evaluate machine learning algorithms for phenotypic virtual screening (PVS) using large, diverse datasets.
- To compare the performance of different algorithms under various validation conditions, including dissimilar-molecule splits.
- To identify the most effective AI approach for predicting compound activity in cancer cell lines.
Main Methods
- Prepared 60 datasets, each containing 30,000-50,000 molecules tested for growth inhibitory activity on an NCI-60 cancer cell line.
- Evaluated five machine learning algorithms for PVS, conducting approximately 14,440 training runs per algorithm.
- Employed both random and dissimilar-molecule split validation types, primarily using hit rate as the performance metric.
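As a rough illustration of the hit-rate metric mentioned above, the sketch below ranks molecules by predicted score and reports the fraction of the top-ranked ones that are truly active. The summary does not give the study's exact definition, so the function name `hit_rate`, the `top_n` cutoff, and the toy data are assumptions made for illustration only.

```python
from typing import Sequence


def hit_rate(scores: Sequence[float], is_active: Sequence[bool], top_n: int) -> float:
    """Fraction of the top_n highest-scored molecules that are truly active.

    Assumed definition: the summary names hit rate as the metric but does
    not spell out how the study computes it.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not 0 < top_n <= len(scores):
        raise ValueError("top_n must be between 1 and the number of molecules")

    # Rank molecule indices from highest to lowest predicted score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = ranked[:top_n]

    # Count how many of the selected molecules are measured actives.
    return sum(1 for i in selected if is_active[i]) / top_n


# Toy example with made-up scores and labels (not data from the study).
scores = [0.9, 0.2, 0.8, 0.4, 0.7]
labels = [True, False, False, False, True]
print(hit_rate(scores, labels, top_n=3))
```

Unlike a correlation or an overall error, a metric of this kind only rewards a model for placing true actives at the top of the ranking, which is the part of the list that would actually be sent for experimental testing.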
Main Results
- All evaluated models faced greater challenges with test molecules dissimilar to the training data.
- The D-MPNN (Directed Message Passing Neural Network) algorithm consistently outperformed other methods across both validation types.
- Performance was significantly impacted by the chemical dissimilarity between training and testing molecule sets.
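To make the notion of train/test dissimilarity more concrete, the sketch below scores each test molecule by its highest Tanimoto similarity to any training molecule. This is one common way to quantify dissimilarity and is not necessarily the measure used in the study. Fingerprints are represented here as plain sets of "on" bit indices, and the toy fingerprints and function names are hypothetical.

```python
from typing import FrozenSet, Iterable

Fingerprint = FrozenSet[int]  # indices of the bits set in a molecular fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_to_training(test_fp: Fingerprint,
                               train_fps: Iterable[Fingerprint]) -> float:
    """Similarity of a test molecule to its nearest neighbour in the training set."""
    return max((tanimoto(test_fp, fp) for fp in train_fps), default=0.0)


# Toy fingerprints (made up for illustration, not derived from real molecules).
train = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5})]
similar_test = frozenset({1, 2, 3})
dissimilar_test = frozenset({7, 8, 9})

print(max_similarity_to_training(similar_test, train))
print(max_similarity_to_training(dissimilar_test, train))
```

Under a dissimilar-molecule split, test molecules would tend to have low nearest-neighbour similarity of this kind, which is the regime in which the summary reports all models struggling more.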
Conclusions
- D-MPNN, a graph-based deep neural network, is the most suitable of the five evaluated algorithms for developing predictive models in phenotypic virtual screening.
- Robust evaluation with diverse chemical libraries and appropriate metrics is crucial for assessing AI models in drug discovery.
- Future PVS efforts should consider chemical diversity and employ validation strategies that mimic real-world screening challenges.

