


Updated: Feb 20, 2026


Benchmarking large language model-based agent systems for clinical decision tasks.

Yunsong Liu (1,2), Zunamys I Carrero (2), Xiaofeng Jiang (2,3)

  • (1) Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.

NPJ Digital Medicine | February 18, 2026
Summary
This summary is machine-generated.

Agentic artificial intelligence (AI) systems deliver only modest accuracy gains over baseline large language models on healthcare benchmarks, despite their use of advanced tools. These gains come with substantially higher token usage and latency, highlighting the need for more accurate and efficient agent designs.


Area of Science:

  • Artificial Intelligence
  • Medical Informatics
  • Computational Medicine

Background:

  • Agentic AI systems, capable of autonomous reasoning and tool use, show potential in healthcare applications.
  • Systematic real-world performance evaluation of these advanced AI systems in medicine is currently limited.
  • Existing benchmarks do not fully capture the complexities of clinical decision-making and tool integration.

Purpose of the Study:

  • To systematically benchmark the real-world performance of two agentic AI systems in healthcare settings.
  • To evaluate the efficacy of agentic AI across diverse medical tasks, including diagnostics, QA, and complex examinations.
  • To assess the trade-offs between performance gains, resource utilization, and hallucination rates in medical AI agents.

Main Methods:

  • Evaluated OpenManus (Llama-4 based) and Manus (proprietary multistep architecture) on AgentClinic, MedAgentsBench, and Humanity's Last Exam (HLE) benchmarks.
  • Assessed performance on text-based and multimodal medical question-answering and diagnostic simulations.
  • Quantified accuracy, token usage, latency, and hallucination rates, with in-agent safeguards in place.
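
The four quantities listed above can be aggregated from per-item run logs and then set against a baseline model. The following is a minimal sketch of that bookkeeping, not the study's code: the `RunRecord` fields, the `summarize` and `compare` helpers, and the toy values are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One benchmark item answered by one system (hypothetical schema)."""
    correct: bool        # answer matched the benchmark key
    tokens: int          # total tokens consumed on this item
    latency_s: float     # wall-clock seconds spent on this item
    hallucinated: bool   # output was flagged as containing unsupported content


def summarize(records):
    """Aggregate per-item records into accuracy, cost, and hallucination rate."""
    if not records:
        raise ValueError("no records to summarize")
    n = len(records)
    return {
        "accuracy": sum(r.correct for r in records) / n,
        "mean_tokens": sum(r.tokens for r in records) / n,
        "mean_latency_s": sum(r.latency_s for r in records) / n,
        "hallucination_rate": sum(r.hallucinated for r in records) / n,
    }


def compare(agent, baseline):
    """Report the accuracy gain next to the multiplicative cost increase."""
    return {
        "accuracy_gain": agent["accuracy"] - baseline["accuracy"],
        "token_ratio": agent["mean_tokens"] / baseline["mean_tokens"],
        "latency_ratio": agent["mean_latency_s"] / baseline["mean_latency_s"],
    }


# Toy records, invented for illustration only (not data from the study).
baseline_runs = [
    RunRecord(True, 500, 2.0, False),
    RunRecord(False, 500, 2.0, True),
    RunRecord(False, 500, 2.0, False),
    RunRecord(True, 500, 2.0, False),
]
agent_runs = [
    RunRecord(True, 4000, 30.0, False),
    RunRecord(True, 4000, 30.0, False),
    RunRecord(False, 4000, 30.0, True),
    RunRecord(True, 4000, 30.0, False),
]

print(compare(summarize(agent_runs), summarize(baseline_runs)))
```

Reporting the gain together with the token and latency ratios keeps the cost side of the trade-off visible, which is the comparison the study's purpose statement describes.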

Main Results:

  • Agentic AI systems provided modest accuracy improvements over baseline LLMs, with significant increases in token usage and latency.
  • Accuracy reached 60.3% on AgentClinic MedQA, 30.3% on MedAgentsBench, and 8.6% on HLE text.
  • Multimodal accuracy was low (15.5% on HLE, 29.2% on AgentClinic NEJM), and hallucinations persisted despite safeguards.

Conclusions:

  • Current agentic AI designs offer limited performance benefits in healthcare relative to their substantial computational and workflow costs.
  • More accurate, efficient, and clinically viable agent systems are critically needed for medical applications.
  • Further research is required to optimize agentic AI architectures for practical healthcare deployment.