Pan-cancer predictive survival model development and evaluation using electronic health record and genetic data across 10 cancer types
- Jurgita Gammall 1,2, Alvina G Lai 3
- 1 Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK. jurgita.gammall.20@ucl.ac.uk.
- 2 Oracle Global Services Limited, London, UK. jurgita.gammall.20@ucl.ac.uk.
- 3 Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK. alvinagracelai@gmail.com.
Summary
This summary is machine-generated. This study developed machine learning models that predict cancer survival from electronic health record and genetic data. The models reached C-indices of 60% to 80% across ten cancer types, and genetic data improved predictions for some cancers, which may help inform treatment decisions.
Area Of Science
- Oncology
- Bioinformatics
- Machine Learning
Background
- Rising cancer incidence necessitates advanced analytical approaches.
- Large-scale healthcare data availability offers opportunities for improved cancer analysis.
- Accurate cancer prognosis is crucial for effective patient management and treatment.
Purpose Of The Study
- To develop and evaluate prognostic cancer survival models for ten common cancer types.
- To compare the performance of various machine learning algorithms in cancer prognosis.
- To assess the added value of genetic information and improve model explainability for clinical adoption.
Main Methods
- Utilized data from 9977 cancer patients across ten cancer types.
- Integrated genetic data (100,000 Genomes Project) with clinical, demographic, and hospital data.
- Developed and compared four machine learning algorithms: Elastic Net Cox, random survival forest, gradient boosting survival, and DeepSurv.
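To make the first of these algorithms concrete, the following is a minimal sketch of the objective an Elastic Net Cox model minimises: the negative log partial likelihood plus a mixed L1/L2 penalty. It is illustrative only and is not the study's implementation. The function name `cox_elastic_net_loss`, the parameters `alpha` (penalty strength) and `l1_ratio` (L1/L2 mix), the Breslow-style handling of tied event times, and the toy data are all assumptions.

```python
import math


def cox_elastic_net_loss(X, times, events, beta, alpha=0.1, l1_ratio=0.5):
    """Penalised negative log partial likelihood of a Cox model.

    Hypothetical helper for illustration: X is a list of feature rows,
    times are follow-up times, events are 1 (death observed) or 0 (censored),
    beta is the coefficient vector. Tied event times are not corrected for.
    """
    n = len(times)
    # Linear predictor (log-risk score) for each patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]

    nll = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored patients only appear inside risk sets
        # Risk set: patients still under observation at time times[i].
        risk_sum = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
        nll -= eta[i] - math.log(risk_sum)

    # Elastic net penalty: a mix of L1 (sparsity) and L2 (shrinkage) terms.
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    penalty = alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)
    return nll / n + penalty


# Toy data (assumed): two features per patient, e.g. scaled age and stage.
X = [[0.5, 1.0], [-0.2, 0.0], [1.1, 1.0], [0.0, 0.0]]
times = [5, 3, 8, 2]
events = [1, 0, 1, 1]
loss_at_zero = cox_elastic_net_loss(X, times, events, beta=[0.0, 0.0])
```

In practice an optimiser searches for the `beta` that minimises this loss, and the L1 term pushes the coefficients of uninformative features to exactly zero, which is one reason such a model can be easier to interpret than the tree-based or neural alternatives.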
Main Results
- Models achieved C-indices ranging from 60% (bladder cancer) to 80% (glioma), averaging 72%.
- Machine learning algorithms performed similarly, with DeepSurv slightly underperforming.
- Genetic data enhanced model performance for endometrial, glioma, ovarian, and prostate cancers.
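The C-index reported above measures how often a model ranks a pair of patients correctly: among comparable pairs, the patient who died earlier should receive the higher predicted risk. The sketch below computes Harrell's C-index in plain Python under simplifying assumptions (pairs with tied survival times are skipped, and tied risk scores count as half-concordant). The function name `concordance_index` and the toy data are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    and a strictly shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier death had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (assumed): four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.8, 0.1]
c_index = concordance_index(times, events, risk_scores)
# 4 of the 5 comparable pairs are ranked correctly, so c_index is 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported range of 60% to 80% sits between the two.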
Conclusions
- Developed robust machine learning models for cancer survival prediction.
- Demonstrated the utility of integrating genetic data into prognostic models.
- Identified key prognostic features including age, stage, TP53 mutations, and tumour mutational burden.

