Large language model produces high accurate diagnosis of cancer from end-motif profiles of cell-free DNA
Summary
This summary is machine-generated. We developed iLLMAC, an AI model using instruction-tuned large language models (LLMs), to detect cancer from cell-free DNA (cfDNA) end-motif profiles. This novel approach shows high accuracy in diagnosing various cancers, including hepatocellular carcinoma.
Area Of Science
- Artificial Intelligence in Oncology
- Genomics and Bioinformatics
- Liquid Biopsy for Cancer Detection
Background
- Instruction-tuned large language models (LLMs) show promise in aligning with human intentions.
- Accurate and early cancer detection remains a critical challenge in oncology.
- Cell-free deoxyribonucleic acid (cfDNA) end-motif profiling offers a potential non-invasive biomarker for cancer.
Purpose Of The Study
- To develop and evaluate an LLM-based model for cancer detection using cfDNA end-motif profiles.
- To assess the performance of the model, named iLLMAC (instruction-tuned LLM for Assessment of Cancer), across different datasets and cancer types.
- To investigate the impact of the number of cfDNA end-motifs on diagnostic accuracy.
Main Methods
- Developed iLLMAC, an instruction-tuned LLM, trained on plasma cfDNA sequencing data from 1135 cancer patients and 1106 controls.
- Utilized cfDNA end-motif profiles as input features for cancer detection.
- Evaluated model performance using the area under the receiver operating characteristic curve (AUROC) on training and external validation datasets.
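To make the input features concrete, here is a minimal sketch of how an end-motif profile could be computed, assuming a profile is the relative frequency of each k-mer at the 5' end of cfDNA fragments. The function name `end_motif_profile`, the default `k=4`, and the toy fragments are illustrative assumptions. The summary does not state the motif length, how the 16 or 64 motifs were selected, or how profiles were passed to the LLM.

```python
from collections import Counter
from itertools import product


def end_motif_profile(fragments, k=4):
    """Return the relative frequency of each k-mer at fragment 5' ends.

    `fragments` is an iterable of DNA strings read 5' to 3'.
    Fragments whose first k bases are incomplete or contain non-ACGT
    characters are skipped. (Assumed definition, not the paper's pipeline.)
    """
    # Enumerate all 4**k possible motifs so every profile has the same keys.
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for seq in fragments:
        motif = seq[:k].upper()
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    # Normalise counts to frequencies; an empty input yields all zeros.
    return {m: (counts[m] / total if total else 0.0) for m in motifs}


# Hypothetical toy fragments, for illustration only.
toy_fragments = ["CCCAGTTA", "CCCATGGA", "AAAATTTT", "NNACGTAC"]
profile = end_motif_profile(toy_fragments, k=4)
```

In this toy input, the fragment starting with `NN` is skipped, so `CCCA` accounts for two of the three counted ends. A real pipeline would typically read fragment ends from aligned sequencing data rather than from raw strings.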
Main Results
- iLLMAC achieved an AUROC of 0.866 for general cancer diagnosis and 0.924 for hepatocellular carcinoma (HCC) detection using 16 end-motifs.
- Performance improved with more motifs, reaching AUROCs of 0.886 for cancer diagnosis and 0.956 for HCC detection with 64 end-motifs.
- On an external test set, iLLMAC achieved AUROCs of 0.912 for cancer diagnosis and 0.938 for HCC detection, outperforming benchmark methods.
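For readers unfamiliar with the metric reported above, the following sketch computes AUROC from its pairwise definition: the probability that a randomly chosen cancer sample receives a higher score than a randomly chosen control, with ties counted as one half. The function `auroc` and the toy labels and scores are illustrative assumptions, not the study's evaluation code or data.

```python
def auroc(labels, scores):
    """AUROC via the pairwise (Mann-Whitney U) formulation.

    `labels` holds 1 for positive (e.g. cancer) and 0 for negative
    (e.g. control); `scores` holds the model's predicted scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores, for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The quadratic loop is kept for clarity; a rank-based implementation is preferable for large cohorts.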
Conclusions
- LLM-based instruction-tuning is effective for developing accurate cfDNA-based cancer detection models.
- iLLMAC demonstrates significant potential for non-invasive cancer diagnosis, particularly for HCC.
- The study highlights the utility of cfDNA end-motif profiles and advanced AI for advancing cancer diagnostics.

