Symptom-based drug prediction of lifestyle-related chronic diseases using unsupervised machine learning techniques

  • Department of Computer Science and Engineering, University of Calcutta, JD-2, Sector-III, Salt Lake, Kolkata, 700098, India.

Summary

This summary is machine-generated.

This study developed an unsupervised machine learning tool to predict drugs for lifestyle-related diseases (LSDs) based on symptoms. The web application aids clinicians in early treatment decisions for heart and lung conditions.

Area Of Science

  • Computational biology
  • Bioinformatics
  • Machine learning in healthcare

Background

  • Lifestyle-related diseases (LSDs) present a significant economic and health burden, often affecting the heart and lungs.
  • Early treatment of LSDs is crucial, and symptom-based information is the primary data available to clinicians.
  • Developing predictive models for drug discovery in LSDs is essential for timely therapeutic interventions.

Purpose Of The Study

  • To apply unsupervised machine learning (ML) techniques for predicting drugs from symptoms in lifestyle-related diseases (LSDs).
  • To focus specifically on developing predictive models for pulmonary and heart diseases, a subset of LSDs.
  • To create a user-friendly web application for symptom-based drug prediction.

Main Methods

  • Used drug-disease and disease-symptom associations covering 143 LSDs, 1271 drugs, and 305 symptoms to compute direct drug-symptom associations.
  • Developed and compared four ML clustering algorithms (K-Means, Bisecting K-Means, Mean Shift, BIRCH) to group drugs based on symptom features.
  • Saved the optimal ML model and developed a web application for drug prediction from input symptoms.
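
The first method step, deriving direct drug-symptom associations from the two indirect association sets, can be illustrated with a minimal sketch. The summary does not state how the associations were scored, so the counting rule below (how many of a drug's indicated diseases present a given symptom) is an assumption, and all drug, disease, and symptom names are hypothetical toy data.

```python
from collections import defaultdict

# Hypothetical toy data; the study itself covered 143 LSDs, 1271 drugs, 305 symptoms.
drug_disease = {
    "drug_a": {"asthma", "copd"},
    "drug_b": {"asthma"},
    "drug_c": {"angina"},
}
disease_symptom = {
    "asthma": {"wheezing", "cough"},
    "copd": {"cough", "dyspnea"},
    "angina": {"chest_pain", "dyspnea"},
}


def drug_symptom_counts(drug_disease, disease_symptom):
    """For each drug, count how many of its diseases present each symptom.

    Assumed scoring rule: the source does not specify the actual formula.
    """
    assoc = {}
    for drug, diseases in drug_disease.items():
        counts = defaultdict(int)
        for disease in diseases:
            for symptom in disease_symptom.get(disease, ()):
                counts[symptom] += 1
        assoc[drug] = dict(counts)
    return assoc


def to_feature_matrix(assoc, symptoms):
    """Build one fixed-order symptom vector per drug (drugs sorted by name)."""
    drugs = sorted(assoc)
    return drugs, [[assoc[d].get(s, 0) for s in symptoms] for d in drugs]


assoc = drug_symptom_counts(drug_disease, disease_symptom)
symptoms = sorted({s for ss in disease_symptom.values() for s in ss})
drugs, X = to_feature_matrix(assoc, symptoms)
```

Each row of `X` is a symptom feature vector for one drug. Rows in this shape could then be passed to a clustering implementation, for example the `KMeans`, `BisectingKMeans`, `MeanShift`, or `Birch` classes in scikit-learn; whether the authors used that library is not stated in this summary.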

Main Results

  • The Bisecting K-Means model demonstrated superior performance, achieving a silhouette coefficient of 0.647 and generating 138 distinct drug clusters.
  • Drugs within identified clusters exhibited significant similarity based on gene ontology, chemical ontology, and maximum common substructure analyses.
  • The web application provides a confidence score for predicted drugs, enhancing its utility for clinical decision support.
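
For readers unfamiliar with the silhouette coefficient used to compare the clustering models, the following is a minimal sketch of the standard definition: for each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the score is `(b - a) / max(a, b)`, averaged over all points. The Euclidean distance and the toy points are assumptions for illustration; the summary does not state which distance metric the study used.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points; singleton clusters contribute 0."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # By convention a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy points forming two well-separated groups.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_coefficient(points, [0, 0, 1, 1])
bad = silhouette_coefficient(points, [0, 1, 0, 1])
```

Values range from -1 to 1. A labeling that matches the natural grouping (`good`) scores close to 1, while one that splits each group across clusters (`bad`) scores below 0.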

Conclusions

  • Direct drug-symptom associations were computed and leveraged to create a novel, unsupervised ML-based tool for predicting drugs for LSDs.
  • This ML-driven prediction tool can serve as a valuable second opinion for clinicians, facilitating earlier treatment initiation for LSD patients.
  • A publicly accessible web application (http://bicresources.jcbose.ac.in/ssaha4/sdldpred) offers a simple interface for end-users to utilize the ML-based drug prediction tool.
