Generalizable clinical note section identification with large language models.

Weipeng Zhou¹, Timothy A Miller²,³

¹Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington-Seattle, Seattle, WA 98195, United States.

August 14, 2024

Summary

This summary is machine-generated.

Large language models (LLMs) show promise for clinical note section identification, with GPT-4 achieving high accuracy. Fine-tuning with specific examples further enhances performance, making LLMs nearly production-ready for this task.

Keywords:

ChatGPT, GPT4, fine-tuning, large language models, section identification

Area of Science:

Natural Language Processing
Clinical Informatics
Artificial Intelligence in Healthcare

Background:

Clinical note section identification is crucial for information retrieval and downstream NLP tasks.
Traditional supervised methods face challenges with transferability across different clinical datasets.
Large language models (LLMs) offer a potential solution to overcome these limitations.

Purpose of the Study:

To evaluate the effectiveness of LLMs for clinical note section identification.
To compare the performance of various LLMs, including GPT-4, GPT-3.5, and open-source models.
To investigate the impact of fine-tuning dataset size and specificity on LLM performance.

Main Methods:

Framed section identification as a question-answering task using free-text section definitions.
Evaluated multiple LLMs off the shelf, without any task-specific training.
Fine-tuned selected LLMs using datasets of varying sizes and specificities.
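
As a rough illustration of the question-answering framing, the sketch below formats a note segment and a set of free-text section definitions into a single question for an LLM. It is not the authors' prompt: the section names, the definitions, the prompt wording, and the helpers `build_prompt` and `parse_answer` are all hypothetical, and the call to an actual model is left out.

```python
from typing import Dict

# Hypothetical section definitions, for illustration only (not from the study).
SECTION_DEFINITIONS: Dict[str, str] = {
    "history_of_present_illness": "Narrative of the events leading to the current visit.",
    "medications": "Drugs the patient is currently taking or has been prescribed.",
    "assessment_and_plan": "The clinician's impression and the intended next steps.",
}


def build_prompt(segment: str, definitions: Dict[str, str]) -> str:
    """Format a note segment and free-text section definitions as one question."""
    options = "\n".join(f"- {name}: {text}" for name, text in definitions.items())
    return (
        "Below are section types with their definitions.\n"
        f"{options}\n\n"
        f"Note segment:\n{segment}\n\n"
        "Question: Which section type does this segment belong to? "
        "Answer with exactly one section name."
    )


def parse_answer(answer: str, definitions: Dict[str, str]) -> str:
    """Map a free-text model answer to a known section name, or 'unknown'."""
    normalized = answer.strip().lower()
    for name in definitions:
        if name in normalized:
            return name
    return "unknown"


if __name__ == "__main__":
    print(build_prompt("Lisinopril 10 mg daily.", SECTION_DEFINITIONS))
```

Because the definitions are plain text, adapting this sketch to a new dataset would mean editing the dictionary rather than retraining a classifier, which is the kind of transferability the framing is meant to provide.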

Main Results:

GPT-4 achieved the highest F1 score (0.77), outperforming other models.
GPT-4 was highly accurate on many individual section types, reaching F1 > 0.9 on 33% of section types and F1 > 0.8 on 56%.
Fine-tuned models showed diminishing returns with larger general datasets but improved with specific section identification examples.
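
To make the reported metrics concrete, here is a minimal sketch of how per-section F1 and the share of section types above a threshold could be computed from gold and predicted labels. The function names and the toy labels are assumptions for illustration; the summary does not state how the study averaged F1 across sections, so no overall score is computed here.

```python
from collections import Counter
from typing import Dict, List


def per_label_f1(gold: List[str], pred: List[str]) -> Dict[str, float]:
    """Compute F1 = 2*TP / (2*TP + FP + FN) separately for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def share_above(scores: Dict[str, float], threshold: float) -> float:
    """Fraction of labels whose F1 is strictly above the threshold."""
    return sum(score > threshold for score in scores.values()) / len(scores)


if __name__ == "__main__":
    # Toy labels, not data from the study.
    gold = ["a", "a", "b", "b", "c"]
    pred = ["a", "a", "b", "c", "c"]
    scores = per_label_f1(gold, pred)
    print(scores)
    print(share_above(scores, 0.9))
```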

Conclusions:

LLMs, particularly GPT-4, are highly promising for generalizable clinical note section identification and are nearing production readiness.
Open-source LLMs are rapidly improving and approaching the performance of leading proprietary models.
Further improvements can be achieved by incorporating section identification examples into LLM fine-tuning datasets.