A deep learning model based on the BERT pre-trained model to predict the antiproliferative activity of anti-cancer chemical compounds

  • Biosensor Research Centre, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, Iran.

Summary

This summary is machine-generated.

Deep transfer learning models like ChemBERTa show promise in predicting anti-cancer drug activity. This study demonstrates their potential for accelerating drug discovery by accurately classifying molecule efficacy across various cancer cell lines.

Area Of Science

  • Computational Chemistry
  • Drug Discovery
  • Bioinformatics

Background

  • Drug discovery aims to find effective compounds with minimal side effects.
  • Traditional quantitative structure-activity relationship (QSAR) studies are limited by high cost, long timelines, and scarce data.
  • Deep transfer learning models offer a promising alternative for predictive modeling in drug discovery.

Purpose Of The Study

  • To evaluate the performance of a deep transfer learning model (ChemBERTa) in predicting the anti-proliferative activity of small molecules against five cancer cell lines.
  • To assess the model's accuracy on both a large public dataset (PubChem) and a smaller in-house dataset.

Main Methods

  • Utilized a BERT-based deep transfer learning model (ChemBERTa) for predictive analysis.
  • Trained and validated the model on over 3,000 synthesized molecules from PubChem for predicting anti-proliferative activity.
  • Tested the model on an in-house dataset of approximately 25 small molecules per cell line, using IC50 values.
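
To make the link between measured IC50 values and a classification target concrete, here is a minimal Python sketch of one common way to turn an IC50 into a binary activity label. The 10 µM cutoff, the function names, and the two example molecules are assumptions made for illustration; the summary does not state which threshold or labelling scheme the study used.

```python
import math

# Hypothetical cutoff in micromolar; the summary does not give the study's threshold.
ACTIVE_THRESHOLD_UM = 10.0


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


def activity_class(ic50_um: float, threshold_um: float = ACTIVE_THRESHOLD_UM) -> int:
    """Return 1 (active) if IC50 is at or below the threshold, else 0 (inactive)."""
    return 1 if ic50_um <= threshold_um else 0


# Placeholder (SMILES, IC50 in µM) pairs; these values are illustrative, not study data.
examples = [("CCO", 250.0), ("c1ccccc1O", 4.2)]
labelled = [(smiles, activity_class(ic50)) for smiles, ic50 in examples]
```

Labels produced this way, paired with SMILES strings, are the kind of input a sequence classifier such as ChemBERTa can be fine-tuned on. A lower IC50 means a more potent compound, which is why the comparison uses "at or below".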

Main Results

  • The ChemBERTa model achieved acceptable accuracy in predicting the activity class for most cancer cell lines (HeLa, MCF7, MDA-MB231) on the PubChem dataset.
  • Performance was less reliable for PC3 and MDA-MB cell lines on the PubChem dataset.
  • On the in-house dataset, the model showed high accuracy for HeLa and acceptable performance for MCF7 and MDA-MB231, but less reliable results for PC3 and HepG2.
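
The per-cell-line comparison above can be sketched as a simple grouped accuracy calculation. This is an illustrative Python example, not the study's evaluation code: the function name and the toy records are assumptions, and the summary does not say which metrics beyond accuracy were reported.

```python
from collections import defaultdict


def accuracy_by_cell_line(records):
    """Compute accuracy per cell line.

    records: iterable of (cell_line, true_label, predicted_label) tuples.
    Returns a dict mapping each cell line to the fraction of correct predictions.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for cell_line, y_true, y_pred in records:
        total[cell_line] += 1
        correct[cell_line] += int(y_true == y_pred)
    return {line: correct[line] / total[line] for line in total}


# Toy records for illustration only; they are not results from the study.
toy_records = [
    ("HeLa", 1, 1), ("HeLa", 0, 0), ("HeLa", 1, 1), ("HeLa", 0, 0),
    ("PC3", 1, 0), ("PC3", 0, 0), ("PC3", 1, 1), ("PC3", 0, 1),
]
scores = accuracy_by_cell_line(toy_records)
```

With roughly 25 molecules per cell line, as in the in-house set, a single misclassification moves accuracy by about 4 percentage points, so small differences between cell lines should be read with caution.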

Conclusions

  • The fine-tuned ChemBERTa model demonstrates potential as a valuable tool for predicting drug efficacy.
  • The model shows promise for application in drug discovery, particularly for predicting outcomes on novel, in-house datasets.
  • Further refinement may be needed to improve prediction reliability for certain cell lines like PC3 and HepG2.
