Prescription data and demographics: An explainable machine learning exploration of colorectal cancer risk factors based on data from Danish national registries
- 1 SDU Health Informatics and Technology, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, 5230 Odense, Denmark.
- 2 Center for Clinical Epidemiology, Odense University Hospital, 5230 Odense, Denmark; Research Unit of Clinical Epidemiology, University of Southern Denmark, 5230 Odense, Denmark.
- 3 Research Unit of General Practice, Department of Public Health, University of Southern Denmark, 5230 Odense, Denmark.
- 4 Center for Regenerative Medication, Odense University Hospital, 5230 Odense, Denmark.
Summary
This summary is machine-generated. Machine learning models can predict colorectal cancer risk from patient demographics and prescription data. The models are precise but miss a substantial share of cases, so they need further refinement before clinical use.
Area Of Science
- Computational biology
- Oncology
- Health informatics
Background
- Colorectal cancer remains a significant global health challenge despite advances in treatment and prevention.
- Predictive models are crucial for early detection and personalized risk management.
Purpose Of The Study
- To evaluate machine learning models for predicting colorectal cancer risk.
- To utilize demographic and prescribed drug data for risk prediction.
- To enhance model interpretability using explainable AI techniques.
Main Methods
- Developed and assessed five machine learning algorithms: Logistic Regression, XGBoost, Random Forest, k-nearest neighbors (kNN), and a Voting Classifier.
- Evaluated predictive performance across multiple time horizons (3, 6, 12, 36 months).
- Employed explainable AI for feature contribution analysis (age, sex, social status, medications).
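The idea behind a voting classifier, as listed in the methods above, can be sketched in a few lines. This is a minimal illustration only: the summary does not say whether the study used hard (majority) or soft (probability-averaging) voting, so both are shown, and the function names `hard_vote` and `soft_vote`, the 0.5 threshold, and the per-model outputs are all hypothetical.

```python
from collections import Counter
from statistics import mean


def hard_vote(labels):
    """Return the majority label among base-model predictions.

    Ties are broken in favor of the smallest label (an arbitrary choice
    made for this sketch).
    """
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def soft_vote(probabilities, threshold=0.5):
    """Average the positive-class probabilities and threshold the mean."""
    return int(mean(probabilities) >= threshold)


# Hypothetical outputs of four base models for a single patient
# (1 = predicted at risk, 0 = not at risk). Not data from the study.
base_labels = [1, 0, 1, 1]
base_probs = [0.80, 0.40, 0.70, 0.55]

hard_result = hard_vote(base_labels)
soft_result = soft_vote(base_probs)
print(hard_result, soft_result)
```

In practice an ensemble like this would more likely be built with a library implementation than written by hand; the sketch only shows how individual model outputs are combined into one prediction.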
Main Results
- The Voting Classifier demonstrated high precision (>0.99) in identifying at-risk patients.
- Recall was moderate (~0.6), meaning roughly 40% of true cases were missed and leaving room for improvement in detection.
- Model performance was consistent across different prediction timeframes.
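To make the gap between precision and recall concrete, the sketch below computes both from confusion-matrix counts. The counts are invented purely for illustration and are not taken from the study; they were chosen only so that the results land near the reported values (precision above 0.99, recall around 0.6).

```python
def precision(tp, fp):
    """Share of patients flagged as at risk who truly are cases."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of true cases that the model actually flags."""
    return tp / (tp + fn)


# Hypothetical counts, NOT from the study:
# 600 true positives, 5 false positives, 400 false negatives.
tp, fp, fn = 600, 5, 400

p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.992 recall=0.600
```

With these illustrative numbers, almost every flagged patient is a true case, yet 400 of 1,000 true cases go undetected, which is the trade-off the results describe.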
Conclusions
- Machine learning effectively identifies individuals at elevated risk for colorectal cancer.
- These predictive models can support early intervention and personalized strategies.
- Further research is necessary before widespread clinical implementation.

