Error Detection in Emergency Radiology Reports Using a Large Language Model: Multistage Evaluation Study.

Hui Shen¹, Tianyang Wu², Fei Wang¹

¹Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe, Guangzhou, Guangdong, 510630, China.

Journal of Medical Internet Research

April 14, 2026

Summary

This summary is machine-generated.

A domain-optimized large language model, DeepSeek-R1, effectively identified errors in Chinese emergency radiology reports. This AI tool shows promise for enhancing quality control and assisting radiologists in high-pressure clinical settings.

Keywords:

emergency radiology; error detection; large language models; quality control

Area of Science:

Artificial Intelligence in Medicine
Medical Imaging and Diagnostics
Radiology Informatics

Background:

Emergency radiology faces challenges with increasing workloads and the risk of reporting errors.
The efficacy of large language models (LLMs) in identifying errors in emergency radiology reports, particularly in non-English contexts, is not well established.
Accurate and timely reporting is critical in emergency radiology.

Purpose of the Study:

To evaluate the performance of a domain-optimized LLM, DeepSeek-R1, in detecting errors within Chinese emergency radiology reports.
To compare the error detection capabilities of DeepSeek-R1 against board-certified radiologists.
To assess the potential of DeepSeek-R1 as an assistive tool for quality control in emergency radiology.

Main Methods:

A dataset of 7435 Chinese emergency radiology reports was compiled.
Five LLMs were initially screened, with DeepSeek-R1 selected for further evaluation using 0-shot and few-shot learning techniques.
Model performance was benchmarked against 12 radiologists and validated on real-world reports.
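
The difference between the 0-shot and few-shot settings can be illustrated with a minimal sketch. The summary does not give the prompts used in the study, so the instruction text, the function name `build_prompt`, and the example format below are all hypothetical; the sketch only shows the general idea that a few-shot prompt prepends worked examples to the same instruction used in the 0-shot case.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the study's actual prompt is not given in the summary.
INSTRUCTION = (
    "You are proofreading an emergency radiology report. "
    "List any errors you find, or reply 'no error'."
)


def build_prompt(report: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a 0-shot prompt when `examples` is empty, otherwise a few-shot prompt.

    Each example is a (report_text, expected_answer) pair shown to the model
    before the report that actually needs checking.
    """
    parts = [INSTRUCTION]
    for i, (example_report, example_answer) in enumerate(examples, start=1):
        parts.append(
            f"Example {i} report:\n{example_report}\n"
            f"Example {i} answer:\n{example_answer}"
        )
    parts.append(f"Report to check:\n{report}\nAnswer:")
    return "\n\n".join(parts)


# Made-up report text, for illustration only.
zero_shot_prompt = build_prompt("Left lung clear. Impression: right pneumothorax.")
few_shot_prompt = build_prompt(
    "Left lung clear. Impression: right pneumothorax.",
    examples=[("No fracture seen. Impression: rib fracture.",
               "Findings and impression contradict each other.")],
)
```

The resulting string would then be sent to whichever model is being evaluated; that call is left out because the summary does not describe how the models were accessed.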

Main Results:

DeepSeek-R1 achieved a higher error detection rate in the few-shot setting (84.4%) than in the 0-shot setting (60.9%).
The LLM outperformed radiology residents and performed comparably to senior and attending radiologists.
DeepSeek-R1 identified critical omissions and other errors more effectively than residents and operated with greater efficiency.
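
As a rough illustration of what an error detection rate measures, the sketch below computes the fraction of known-erroneous reports that a reviewer (model or human) flagged. The summary does not state how the study scored detections, so this definition, the function name `detection_rate`, and the toy data are assumptions and do not reproduce the reported figures.

```python
from typing import List


def detection_rate(flagged: List[bool]) -> float:
    """Fraction of known-erroneous reports that the reviewer flagged.

    `flagged[i]` is True if the reviewer caught the error in report i.
    Assumes every report in the list contains exactly one known error.
    """
    if not flagged:
        raise ValueError("need at least one report")
    return sum(flagged) / len(flagged)


# Toy data, invented for illustration; not taken from the study.
zero_shot_hits = [True, False, True, False]
few_shot_hits = [True, True, True, False]

print(detection_rate(zero_shot_hits))  # 0.5
print(detection_rate(few_shot_hits))   # 0.75
```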

Conclusions:

The domain-optimized LLM, DeepSeek-R1, shows significant potential for improving the quality control of emergency radiology reports.
Its performance and efficiency suggest that it could serve as an assistive proofreading tool in clinical radiology workflows.
DeepSeek-R1 can aid in reducing errors and enhancing the accuracy of diagnostic reporting under time constraints.