



Large Language Models in Radiologic Numerical Tasks: A Thorough Evaluation and Error Analysis.

Ali Nowroozi¹, Masha Bondarenko¹, Adrian Serapio¹

  • ¹Center for Intelligent Imaging, Department of Radiology and Biomedical Imaging, University of California, San Francisco (UCSF), San Francisco, CA, USA.

Journal of Imaging Informatics in Medicine | January 21, 2026
Summary
This summary is machine-generated.

This study evaluated large language models (LLMs) on numerical extraction and judgment tasks in radiology. Reinforcement learning (RL) reasoning models performed consistently well, and no mathematical errors were found in their outputs.

Keywords:
Data extraction, Large language models, Mathematics, Numbers, Radiology reports, Reasoning


Area of Science:

  • Medical Imaging and Artificial Intelligence
  • Natural Language Processing in Healthcare

Background:

  • Large language models (LLMs) show promise in processing clinical text.
  • Evaluating LLM performance in specific medical domains like radiology is crucial.

Purpose of the Study:

  • To assess the performance of various LLMs on radiology numerical extraction and judgment tasks.
  • To conduct a detailed error analysis of LLM outputs in these tasks.

Main Methods:

  • Six radiology tasks were defined: three extraction tasks (T-score, common bile duct (CBD) diameter, lung nodule size) and three judgment tasks (PET hypermetabolism, osteoporosis, CBD dilation).
  • LLMs evaluated included Llama 3.1 8B, DeepSeek R1 distilled Llama 8B, OpenAI o1-mini, and OpenAI GPT-5-mini, using data from MIMIC-III and institutional databases.
  • Manual review and error analysis were performed on all incorrect LLM outputs.
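
To make the pairing of an extraction task with a judgment task concrete, the sketch below pulls T-scores out of a report and applies a threshold rule for osteoporosis. It is a rule-based illustration only, not the study's method (the study used LLMs). The regular expression, the function names, and the choice to judge on the lowest reported T-score are assumptions. The threshold of T-score ≤ -2.5 is the WHO densitometric criterion. The summary does not state the thresholds used for CBD dilation or PET hypermetabolism, so those tasks are left out.

```python
import re
from typing import Optional

# Hypothetical pattern: matches phrasings such as "T-score: -2.7" or
# "T score of -1.3". Real reports vary far more than this covers.
T_SCORE_RE = re.compile(
    r"T[- ]?score\s*(?:of|is|:|=)?\s*([+-]?\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def lowest_t_score(report: str) -> Optional[float]:
    """Extraction step: return the lowest T-score mentioned, or None."""
    values = [float(v) for v in T_SCORE_RE.findall(report)]
    return min(values) if values else None


def is_osteoporosis(t_score: float) -> bool:
    """Judgment step: WHO densitometric criterion, T-score <= -2.5."""
    return t_score <= -2.5


if __name__ == "__main__":
    report = "Lumbar spine T-score: -1.8. Femoral neck T-score of -2.7."
    t = lowest_t_score(report)
    print(t, is_osteoporosis(t) if t is not None else None)
```

A rule like this gives a reference label against which a model's judgment can be compared; it fails on phrasings the pattern does not anticipate, which is one reason free-text extraction is delegated to an LLM in the first place.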

Main Results:

  • For extraction tasks, the RL models (o1-mini, GPT-5-mini) achieved >95% accuracy, while Llama showed variability (86%-98.7%).
  • In judgment tasks, o1-mini and GPT-5-mini achieved accuracies of 91.7% and 99.0%, respectively, with 100% accuracy in osteoporosis detection.
  • No mathematical errors were found in o1-mini and GPT-5-mini outputs. An answer-only output format reduced the performance of Llama and DeepSeek distilled Llama.
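
Accuracy figures like those above are the fraction of model outputs that match the reference value, with the mismatches set aside for manual review. The following is a minimal sketch of that bookkeeping under stated assumptions: the `Case` record, the `score` function, and the numeric tolerance are hypothetical and do not come from the study, and a missing model answer (`None`) is counted as an error.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Case:
    case_id: str
    predicted: Optional[float]  # None if the model gave no usable number
    reference: float            # value assigned by the human reviewer


def score(cases: List[Case], tol: float = 1e-6) -> Tuple[float, List[Case]]:
    """Return (accuracy, incorrect cases queued for manual review)."""
    if not cases:
        raise ValueError("no cases to score")
    errors = [
        c for c in cases
        if c.predicted is None or abs(c.predicted - c.reference) > tol
    ]
    accuracy = (len(cases) - len(errors)) / len(cases)
    return accuracy, errors


if __name__ == "__main__":
    cases = [
        Case("a", -2.7, -2.7),
        Case("b", 6.0, 6.0),
        Case("c", None, 4.0),  # model produced no answer
        Case("d", 8.0, 5.0),   # model produced the wrong number
    ]
    accuracy, errors = score(cases)
    print(f"accuracy={accuracy:.1%}", [c.case_id for c in errors])
```

Keeping the incorrect cases as full records, rather than only counting them, is what allows each one to be categorized later, for example as a mathematical error versus a misread report.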

Conclusions:

  • Reinforcement learning (RL) reasoning LLMs perform consistently well on radiology numerical tasks, with no mathematical errors.
  • Non-RL models can also achieve acceptable performance, depending on the specific task complexity.