Evaluating Medical Text Summaries Using Automatic Evaluation Metrics and LLM-as-a-Judge Approach: A Pilot Study.

Yuriy Vasilev¹, Irina Raznitsyna¹, Anastasia Pamova¹,²

  • ¹Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department, 127051 Moscow, Russia.

Diagnostics (Basel, Switzerland) | January 10, 2026
Summary
This summary is machine-generated.

Large Language Models (LLMs) show promise for summarizing electronic health records (EHRs). However, automated quality control methods, including LLM-as-a-judge, struggle to detect factual errors, necessitating expert review.

Keywords:
LLM-as-a-judge; electronic health records; large language model; summaries

Area of Science:

  • Medical Informatics
  • Artificial Intelligence in Healthcare

Background:

  • Electronic health records (EHRs) contain vital clinical data but are challenging to process.
  • Large Language Models (LLMs) offer a promising solution for summarizing EHR data to aid physicians.
  • Automated quality control is essential for integrating LLM summarization tools into clinical practice.

Purpose of the Study:

  • To assess the feasibility and limitations of automated quality control for LLM-generated medical summaries.
  • To evaluate automatic metrics and LLM-as-a-judge approaches without expert involvement.

Main Methods:

  • Six open-source LLMs generated summaries from 30 EHR text samples.
  • Summaries were evaluated using standard metrics (BLEU, ROUGE, METEOR, BERTScore) and LLM-as-a-judge.
  • Criteria included relevance, completeness, redundancy, coherence, grammar, terminology, and hallucination detection.
  • Expert evaluation was performed using the same criteria for comparison.
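
The overlap-based metrics listed above (BLEU, ROUGE, METEOR) all compare word sequences in a generated summary against a reference text. The following is a minimal sketch of a ROUGE-N style score in plain Python, not the evaluation code used in the study. The function names and the example sentences are hypothetical, and whitespace tokenization is a simplifying assumption; published ROUGE implementations add stemming and other normalization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """ROUGE-N style precision, recall, and F1 from clipped n-gram overlap."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # each n-gram counted at most min(ref, cand) times
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical sentences for illustration; not taken from the study data.
reference = "patient denies chest pain"
faithful = "patient denies chest pain"
distorted = "patient reports chest pain"

print(rouge_n(reference, faithful))
print(rouge_n(reference, distorted))
```

In this toy case the distorted sentence differs from the reference by a single word that reverses the clinical meaning, yet three of its four unigrams still match. Surface-overlap scores measure shared wording, not factual agreement.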

Main Results:

  • LLMs demonstrate significant potential for summarizing medical data.
  • Neither automatic metrics nor LLM judges reliably detect factual errors or semantic distortions (hallucinations).
  • A Pearson correlation of 0.688 was observed between LLM-assigned summary quality scores and expert scores for the relevance criterion.
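
Agreement between two sets of scores, such as the correlation reported above, can be quantified with the Pearson coefficient. The sketch below shows the calculation in plain Python. The score lists are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


# Hypothetical 1-5 relevance ratings for six summaries; NOT the study's data.
judge_scores = [4, 5, 3, 4, 2, 5]
expert_scores = [4, 4, 3, 5, 2, 4]

r = pearson(judge_scores, expert_scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 1 means the two raters rank and space the summaries similarly. It does not show that either rater catches factual errors, since both can agree on a fluent but incorrect summary.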

Conclusions:

  • Fully automating the quality evaluation of medical summaries remains a significant challenge.
  • Future research should prioritize hallucination detection methods and explore larger, specialized LLMs for medical text.
  • Integrating retrieval-augmented generation (RAG) into LLM-as-a-judge architectures warrants further investigation.
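
One possible reading of the retrieval-augmented judging idea is to fetch the passages of the source record most relevant to each summary statement and place them in the judge's prompt. The sketch below illustrates only that retrieval and prompt-assembly step, under stated assumptions: token-overlap ranking stands in for an embedding-based retriever, no judge model is called, and all names, the prompt wording, and the example record are hypothetical. It is not the architecture proposed by the authors.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric tokens in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(claim, source_sentences, k=2):
    """Return the k source sentences with the highest Jaccard overlap with the claim."""
    claim_tokens = tokens(claim)

    def score(sentence):
        sentence_tokens = tokens(sentence)
        union = claim_tokens | sentence_tokens
        return len(claim_tokens & sentence_tokens) / len(union) if union else 0.0

    return sorted(source_sentences, key=score, reverse=True)[:k]


def build_judge_prompt(claim, evidence):
    """Assemble a prompt that asks a judge to check one statement against evidence."""
    lines = ["Evidence from the source record:"]
    lines += [f"- {item}" for item in evidence]
    lines += [
        "",
        f"Summary statement: {claim}",
        "Is the statement supported by the evidence? Answer yes or no.",
    ]
    return "\n".join(lines)


# Hypothetical record and summary statement, for illustration only.
record = [
    "Patient denies chest pain.",
    "Blood pressure was 130 over 85.",
    "CT of the chest shows no focal lesions.",
]
claim = "The patient reports chest pain."

evidence = retrieve(claim, record, k=1)
prompt = build_judge_prompt(claim, evidence)
print(prompt)
```

Placing the retrieved sentence next to the statement gives the judge the specific evidence it needs to notice a contradiction. Whether this improves hallucination detection in practice is the open question the conclusions raise.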