



Improving Radiology Report Error Detection Using a Multipass Large Language Model: Framework Development and

Songsoo Kim1, Seungtae Lee2, See Young Lee3

  • 1Department of Radiology, Seoul National University Hospital, Seoul, Republic of Korea.

JMIR Medical Informatics | June 4, 2026 | PubMed
Summary
This summary is machine-generated.

An optimized multipass large language model (LLM) framework significantly improved precision and cost-efficiency for radiology report error detection. This AI-radiologist collaboration offers a scalable solution for quality assurance in radiology.

Keywords:
error detection; human-in-the-loop; large language models; quality assurance; radiology report


Area of Science:

  • Artificial Intelligence in Medical Imaging
  • Radiology Quality Assurance
  • Natural Language Processing in Healthcare

Background:

  • Large language models (LLMs) used to proofread radiology reports often produce many false positives (FPs): because true errors are rare in clinical reports, even a small false-alarm rate can generate more false flags than real detections.
  • This limitation hinders the practical application of LLMs for automated quality assurance in radiology.
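
The link between low error prevalence and low precision can be illustrated with the standard relation between PPV, sensitivity, specificity, and prevalence. The sketch below is illustrative only: the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence                  # flagged reports that contain an error
    false_pos = (1.0 - specificity) * (1.0 - prevalence) # flagged reports that are error-free
    return true_pos / (true_pos + false_pos)


# Hypothetical detector: 90% sensitivity and 90% specificity (assumed values).
balanced = ppv(0.9, 0.9, prevalence=0.50)  # errors in half of all reports
rare = ppv(0.9, 0.9, prevalence=0.02)      # errors in 2% of reports

print(f"PPV at 50% prevalence: {balanced:.3f}")
print(f"PPV at 2% prevalence:  {rare:.3f}")
```

With the detector held fixed, PPV falls sharply as errors become rarer, which is why most flags in a low-prevalence setting turn out to be false positives.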

Purpose of the Study:

  • To evaluate an optimized LLM framework designed to enhance precision and cost-efficiency in detecting errors within radiology reports.
  • To determine if the proposed framework could maintain or improve error detection capabilities while reducing false positives.

Main Methods:

  • A retrospective analysis of 1000 radiology reports across various modalities (radiography, ultrasonography, CT, MRI) from the MIMIC-III database.
  • Evaluation of three LLM frameworks: single-prompt detector, report extractor plus single-prompt detector, and a multipass framework with an FP verifier.
  • Assessment of precision using positive predictive value (PPV) and error detection rates, alongside estimation of model inference and reviewer labor costs.
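
A minimal sketch of how a multipass design with an FP verifier might be wired together is shown below. It assumes that extraction, detection, and verification run as sequential passes, which the summary does not spell out. The function names (`multipass_flag`, `toy_extract`, `toy_detect`, `toy_verify`) are hypothetical, and the rule-based stand-ins take the place of LLM calls purely so the example runs; they are not the authors' prompts or models.

```python
from typing import Callable, List, Tuple


def multipass_flag(
    reports: List[str],
    extract: Callable[[str], str],
    detect: Callable[[str], List[str]],
    verify: Callable[[str, str], bool],
) -> List[Tuple[str, List[str]]]:
    """Return (report, confirmed_errors) for reports that survive every pass."""
    flagged = []
    for report in reports:
        text = extract(report)                                # pass 1: isolate report content
        candidates = detect(text)                             # pass 2: propose candidate errors
        confirmed = [c for c in candidates if verify(text, c)]  # pass 3: drop likely FPs
        if confirmed:
            flagged.append((report, confirmed))
    return flagged


# Toy stand-ins for the LLM passes (assumptions for illustration only).
def toy_extract(report: str) -> str:
    return report.strip().lower()


def toy_detect(text: str) -> List[str]:
    both_sides = "left" in text and "right" in text
    return ["possible laterality mismatch"] if both_sides else []


def toy_verify(text: str, candidate: str) -> bool:
    # Reject the candidate when the report explicitly describes both sides.
    return "bilateral" not in text


reports = [
    "Left pleural effusion. Impression: right pleural effusion.",
    "Bilateral effusions, left greater than right.",
    "No acute findings.",
]
flagged = multipass_flag(reports, toy_extract, toy_detect, toy_verify)
print(len(flagged), "of", len(reports), "reports sent for human review")
```

Only reports whose candidate errors survive the verifier reach a human reviewer, which is the mechanism by which a verification pass can reduce review burden.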

Main Results:

  • The multipass LLM framework (framework 3) demonstrated a significant increase in PPV (0.159) compared to single-prompt frameworks (0.063-0.079).
  • Human review burden was reduced by over 50% (from 192 to 88 reports per 1000), and model inference costs decreased by up to 42.6%.
  • Remaining FPs were primarily associated with complex clinical context, indicating a shift from structural errors to nuanced discrepancies.
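
The reported review-burden figures can be checked with simple arithmetic. The sketch below uses only the numbers quoted above. The estimate of true and false positives assumes that the PPV of 0.159 applies to the 88 flagged reports, which the summary does not state explicitly.

```python
flagged_before = 192  # reports per 1000 needing human review (reported)
flagged_after = 88    # reports per 1000 with the multipass framework (reported)
ppv_multipass = 0.159 # reported PPV for framework 3

# Relative reduction in the number of reports a human must review.
reduction = (flagged_before - flagged_after) / flagged_before
print(f"Review burden reduction: {reduction:.1%}")

# Assumption: the reported PPV describes these 88 flagged reports.
est_true_pos = round(flagged_after * ppv_multipass)
est_false_pos = flagged_after - est_true_pos
print(f"Estimated true positives:  {est_true_pos}")
print(f"Estimated false positives: {est_false_pos}")
```

Under that assumption, most flagged reports would still be false positives, consistent with the statement that the remaining FPs involve complex clinical context requiring human judgment.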

Conclusions:

  • The optimized multipass LLM framework effectively improves precision and cost-efficiency for radiology report error detection in low-prevalence settings.
  • This approach facilitates a synergistic AI-radiologist collaboration, offering a scalable and cost-effective solution for AI-assisted quality assurance in radiology.
  • The framework enables a targeted human-in-the-loop workflow by filtering out simple errors, allowing human reviewers to focus on complex cases.