Deep learning-based NLP data pipeline for EHR-scanned document information extraction.

Enshuo Hsu¹, Ioannis Malagaris¹, Yong-Fang Kuo¹

  • ¹Department of Biostatistics and Data Science, University of Texas Medical Branch, Galveston, Texas, USA.

JAMIA Open | June 15, 2022
Summary
This summary is machine-generated.

Extracting information from scanned health documents requires careful image processing and natural language processing (NLP). This study shows that combining image preprocessing with NLP models significantly improves the extraction of sleep apnea indicators from scanned documents in electronic health records (EHR).

Keywords:
electronic health records; natural language processing; optical character recognition; polysomnography; scanned document

Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Machine Learning

Background:

  • Scanned documents in electronic health records (EHR) present ongoing challenges for data extraction.
  • Existing methods combine image preprocessing, optical character recognition (OCR), and natural language processing (NLP), but their interactions are under-explored.
  • Accurate extraction of sleep apnea indicators such as the apnea-hypopnea index (AHI) and oxygen saturation (SaO2) from scanned reports is crucial for patient care.

Purpose of the Study:

  • To evaluate the impact of image preprocessing techniques and NLP models on the accuracy of extracting sleep apnea indicators from scanned EHR reports.
  • To investigate the role of document layout information in enhancing the performance of deep learning models for scanned document processing.
  • To optimize end-to-end performance for extracting AHI and SaO2 from sleep study reports.

Main Methods:

  • Evaluated 955 scanned sleep study reports for AHI and SaO2 extraction.
  • Applied various image preprocessing methods (gray-scaling, dilation, erosion, contrast adjustment) followed by OCR with Tesseract.
  • Compared seven traditional models and three deep learning models, including architectures with and without structured document layout input; ClinicalBERT served as the NLP component of the proposed method.
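
The preprocessing steps named above can be illustrated with a minimal Python sketch. It is not the study's implementation: the summary does not give kernel sizes, the order of operations, or contrast settings, so the 3x3 window, the min-max contrast stretch, and the hypothetical `preprocess` chain below are all assumptions. Images are plain nested lists so the example stays self-contained; a real pipeline would more likely use an image library and pass the result to Tesseract.

```python
from typing import List, Tuple

Gray = List[List[int]]                    # rows of 0-255 intensities
RGB = List[List[Tuple[int, int, int]]]    # rows of (r, g, b) pixels


def to_gray(rgb: RGB) -> Gray:
    """Gray-scaling with the common BT.601 luma weights (an assumed choice)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def stretch_contrast(img: Gray) -> Gray:
    """Linearly rescale intensities so the darkest pixel is 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:                          # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def _morph(img: Gray, pick) -> Gray:
    """Apply `pick` (max or min) over each pixel's 3x3 neighbourhood, clipped at edges."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(img: Gray) -> Gray:
    """Gray-scale dilation: bright regions grow, so dark strokes get thinner."""
    return _morph(img, max)


def erode(img: Gray) -> Gray:
    """Gray-scale erosion: bright regions shrink, so dark strokes get thicker."""
    return _morph(img, min)


def preprocess(rgb: RGB) -> Gray:
    """Hypothetical chain: gray-scale, stretch contrast, then dilate and erode."""
    return erode(dilate(stretch_contrast(to_gray(rgb))))
```

Dilation followed by erosion removes dark specks smaller than the window, but it can also erase thin strokes in dark-on-light text, which is one reason the choice and order of these steps has to be evaluated against OCR output rather than assumed.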

Main Results:

  • The proposed method using ClinicalBERT achieved an AUROC of 0.9743 and 94.76% document accuracy for AHI.
  • For SaO2, the method reached an AUROC of 0.9523 and 91.61% document accuracy.
  • Demonstrated that image preprocessing and document layout information positively influence extraction performance.
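
For readers unfamiliar with the two metrics reported above, the following Python sketch shows one way they can be computed. It rests on assumptions: the summary does not define "document accuracy", so it is taken here to mean the share of reports whose top-scoring candidate value is the correct one, and the function names and data layout are hypothetical rather than taken from the study.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def document_accuracy(docs: List[List[Tuple[float, bool]]]) -> float:
    """Share of documents whose highest-scoring candidate is marked correct.

    Each document is a list of (model_score, is_correct) candidates.
    A document with no candidates counts as a miss.
    """
    if not docs:
        raise ValueError("need at least one document")
    hits = 0
    for candidates in docs:
        if candidates and max(candidates, key=lambda c: c[0])[1]:
            hits += 1
    return hits / len(docs)


if __name__ == "__main__":
    # Toy data, not from the study.
    print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
    print(document_accuracy([[(0.9, True), (0.2, False)],
                             [(0.3, True), (0.8, False)]]))
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for a sketch; a rank-based computation is the usual choice for larger evaluation sets.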

Conclusions:

  • Effective information extraction from scanned EHR documents necessitates a multi-step approach involving optimized image processing and NLP.
  • The integration of image preprocessing and document layout awareness significantly benefits the performance of NLP models in analyzing scanned medical reports.
  • Developing robust NLP systems for scanned documents remains critical for leveraging valuable information within healthcare data.