Enhancing hepatopathy clinical trial efficiency: a secure, large language model-powered pre-screening pipeline

  • 1 The First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning, 530023, China.
  • 2 Institute of Biointelligence Technology, BGI Research, Wuhan, 430074, China.
  • 3 The First Affiliated Hospital of Guangxi University of Chinese Medicine, Nanning, 530023, China. 6128239@qq.com.
  • 4 Institute of Biointelligence Technology, BGI Research, Wuhan, 430074, China. wanglei12@genomics.cn.
  • 5 Guangdong Bigdata Engineering Technology Research Center for Life Sciences, BGI Research, Shenzhen, 518083, China. wanglei12@genomics.cn.

Abstract

BACKGROUND

Recruitment for cohorts involving complex liver diseases, such as hepatocellular carcinoma and liver cirrhosis, often requires interpreting semantically complex criteria. Traditional manual screening methods are time-consuming and prone to errors. While AI-powered pre-screening offers potential solutions, challenges remain regarding accuracy, efficiency, and data privacy.

METHODS

We developed a novel patient pre-screening pipeline that leverages clinical expertise to guide the precise, safe, and efficient application of large language models. The pipeline breaks down complex criteria into a series of composite questions and then employs two strategies to perform semantic question-answering over electronic health records: (1) Pathway A, the Anthropomorphized Experts' Chain of Thought strategy; and (2) Pathway B, the Preset Stances within an Agent Collaboration strategy, which is aimed particularly at complex clinical reasoning scenarios. The pipeline is evaluated on key metrics, including precision, recall, time consumption, and counterfactual inference, at both the question and criterion levels.
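As a rough illustration of what evaluating at both the question and criterion levels can mean, the sketch below computes precision and recall over individual question answers and over criteria aggregated from those answers. It is not the authors' implementation: the `Judgement` record, the identifiers, and the rule that a criterion counts as met only when all of its composite questions are answered positively are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgement:
    criterion_id: str  # hypothetical identifier of an eligibility criterion
    question_id: str   # hypothetical identifier of one composite question
    predicted: bool    # answer produced by the pre-screening pipeline
    actual: bool       # reference answer from manual review


def precision_recall(pairs):
    """Return (precision, recall) for an iterable of (predicted, actual) booleans."""
    pairs = list(pairs)
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def question_level(judgements):
    """Score every composite question independently."""
    return precision_recall((j.predicted, j.actual) for j in judgements)


def criterion_level(judgements):
    """Score whole criteria.

    Assumption for this sketch: a criterion is met only if all of its
    composite questions are answered True.
    """
    grouped = {}
    for j in judgements:
        pred, act = grouped.get(j.criterion_id, (True, True))
        grouped[j.criterion_id] = (pred and j.predicted, act and j.actual)
    return precision_recall(grouped.values())


# Toy data, invented purely for illustration.
sample = [
    Judgement("c1", "q1", True, True),
    Judgement("c1", "q2", True, True),
    Judgement("c2", "q1", True, False),
    Judgement("c2", "q2", False, False),
    Judgement("c3", "q1", False, True),
    Judgement("c3", "q2", True, True),
]
```

With this toy data, a single missed question in criterion `c3` lowers criterion-level recall more sharply than question-level recall, which is one reason the two levels can usefully be reported separately.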

RESULTS

Our pipeline achieved a notable balance of high precision (e.g., 0.921 at the criterion level) and good overall recall (e.g., ~0.82 at the criterion level), alongside high efficiency (0.44 s per task). Pathway B excelled in high-precision complex reasoning (while exhibiting a specific recall profile conducive to accuracy), whereas Pathway A was particularly effective for tasks requiring both robust precision and recall (e.g., direct data extraction), often with faster processing times. Both pathways achieved comparable overall precision while offering different strengths in the precision-recall trade-off. The pipeline showed promising precision-focused results in hepatocellular carcinoma trials (0.878) and cirrhosis trials (0.843).

CONCLUSIONS

This data-secure and time-efficient pipeline shows high precision and achieves good recall in hepatopathy trials, providing promising solutions for streamlining clinical trial workflows. Its efficiency, adaptability, and balanced performance profile make it suitable for improving patient recruitment, and its capability to function in resource-constrained environments further enhances its utility in clinical settings.