Automated structured data extraction from intraoperative echocardiography reports using large language models
Summary
This summary is machine-generated. Consensus-based large language model (LLM) ensembles can automate structured data extraction from echocardiography reports. The unanimous LLM ensemble demonstrated high accuracy and low error rates in analyzing intraoperative transesophageal reports.
Area Of Science
- Artificial Intelligence in Medicine
- Natural Language Processing in Healthcare
- Cardiovascular Imaging Informatics
Background
- Echocardiography reports contain valuable data in unstructured text format.
- Automated structured data extraction is needed to improve efficiency and data utilization.
- Large Language Models (LLMs) show promise for this task.
Purpose Of The Study
- To evaluate the effectiveness of consensus-based LLM ensembles for extracting structured echocardiographic data.
- To compare different LLM ensemble voting strategies (unanimous, supermajority, majority, plurality).
- To assess accuracy, error rates, and data yield in intraoperative transesophageal echocardiography reports.
Main Methods
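- A cross-sectional study used 600 intraoperative transesophageal echocardiography reports.
- Three key echocardiographic parameters were extracted: left ventricular ejection fraction (LVEF), right ventricular (RV) systolic function, and tricuspid regurgitation (TR).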
- Five open-source LLMs and four voting strategies were employed to create ensembles.
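The four voting strategies can be illustrated with a minimal sketch. The summary does not give the study's exact thresholds, so the rules below are assumptions: unanimous requires every model to agree, supermajority requires at least two thirds of the models (4 of 5), majority requires more than half (3 of 5), and plurality accepts the single most common answer and abstains on a tie. The function name `consensus` and the example labels are hypothetical.

```python
import math
from collections import Counter
from typing import Optional, Sequence


def consensus(votes: Sequence[str], strategy: str) -> Optional[str]:
    """Return the ensemble's answer, or None when no consensus is reached.

    Thresholds are illustrative assumptions, not the study's definitions.
    """
    if not votes:
        return None

    n = len(votes)
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]

    if strategy == "plurality":
        # Assumed rule: the top answer must strictly beat the runner-up.
        if len(ranked) > 1 and ranked[1][1] == top_count:
            return None
        return top_label

    if strategy == "unanimous":
        needed = n                      # every model agrees
    elif strategy == "supermajority":
        needed = math.ceil(2 * n / 3)   # assumed: at least two thirds
    elif strategy == "majority":
        needed = n // 2 + 1             # more than half
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return top_label if top_count >= needed else None


# Hypothetical outputs from five models for one report field.
votes = ["normal", "normal", "normal", "mild", "normal"]
results = {s: consensus(votes, s)
           for s in ("unanimous", "supermajority", "majority", "plurality")}
```

With these assumed thresholds, a 4-to-1 split fails the unanimous rule but passes the other three, which is why stricter strategies return an answer for fewer reports.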
Main Results
- The unanimous LLM ensemble achieved the highest consensus accuracy (99.4% presurgical, 97.9% postsurgical) and lowest error rates.
- The plurality LLM ensemble yielded the highest raw accuracy (96.1% presurgical, 93.7% postsurgical) and data extraction yield.
- Performance varied significantly across different voting strategies and report sections.
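How the reported metrics relate to one another can be sketched in a few lines. The definitions here are assumptions, since the summary does not spell them out: yield is the share of fields where the ensemble reached consensus, consensus accuracy is the share of those consensus answers that are correct, raw accuracy counts correct answers over all fields (abstentions count as not correct), and error rate counts wrong consensus answers over all fields. The function `summarize` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Optional, Sequence


def summarize(predictions: Sequence[Optional[str]],
              truth: Sequence[str]) -> Dict[str, float]:
    """Compute yield and accuracy metrics under the assumed definitions.

    A prediction of None means the ensemble reached no consensus.
    """
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have equal length")
    if not truth:
        raise ValueError("at least one labeled field is required")

    total = len(truth)
    answered = [(p, t) for p, t in zip(predictions, truth) if p is not None]
    correct = sum(1 for p, t in answered if p == t)

    return {
        "yield": len(answered) / total,
        "consensus_accuracy": (correct / len(answered)
                               if answered else float("nan")),
        "raw_accuracy": correct / total,
        "error_rate": (len(answered) - correct) / total,
    }


# Toy data: four fields, one abstention, one wrong consensus answer.
metrics = summarize(["normal", None, "mild", "normal"],
                    ["normal", "normal", "mild", "mild"])
```

Under these definitions, a strict rule such as unanimous voting abstains more often, which can raise consensus accuracy while lowering yield and raw accuracy. That pattern matches the trade-off between the unanimous and plurality ensembles described above.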
Conclusions
- Consensus-based LLM ensembles can successfully generate structured data from unstructured echocardiography reports.
- The choice of voting strategy impacts the trade-off between accuracy, yield, and error rates.
- LLM ensembles offer a viable automated solution for echocardiographic data extraction.

