Scaling sensor metadata extraction for exposure health using LLMs.

Fatemeh Shah-Mohammadi¹, Sunho Im², Julio C Facelli¹,³

  • ¹Department of Biomedical Informatics, The University of Utah, Salt Lake City, UT 84108, United States.

Exposome | March 27, 2026
Summary
This summary is machine-generated.

We developed a large language model (LLM) pipeline to automate sensor metadata extraction from research papers. The pipeline was substantially more efficient than manual review and reached 88.0% accuracy, supporting its use in exposure health research.

Area of Science:

  • Environmental health
  • Data science
  • Bioinformatics

Background:

  • Sensor technologies are rapidly evolving, creating diverse data formats.
  • Inconsistent sensor metadata reporting hinders exposome and exposure health research.
  • Manual extraction of sensor metadata from literature is unscalable.

Purpose of the Study:

  • To develop and evaluate a large language model (LLM)-based pipeline for automating sensor metadata extraction.
  • To address the bottleneck of manual metadata extraction from unstructured sources.
  • To harmonize sensor metadata into structured formats for exposure health research.

Main Methods:

  • Utilized GPT-4 in a zero-shot setting to construct the LLM pipeline.
  • Developed a pipeline to parse full-text PDFs for sensor metadata extraction.
  • Implemented harmonization of extracted metadata into structured formats.

Main Results:

  • The automated pipeline demonstrated substantial efficiency gains over manual review.
  • Achieved high performance metrics: 88.0% accuracy, 88.0% precision, 93.0% recall, and 90.0% F1-score.
  • Successfully extracted and harmonized sensor metadata from exposure health literature.

Conclusions:

  • LLM-driven pipelines are feasible and scalable for automating sensor metadata extraction in exposure health.
  • This automation reduces manual burden and enhances metadata completeness and consistency.
  • Findings support integrating LLM pipelines into exposure health informatics platforms.

Keywords:
GPT, exposure health, information extraction, metadata, sensor
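
The performance figures reported under Main Results can be cross-checked against each other. The sketch below computes the F1-score as the harmonic mean of the reported precision and recall. It assumes the reported values are aggregate figures over the same set of predictions; the summary does not say how the metrics were averaged, so the function name `f1_score` and the comparison tolerance are illustrative choices, not part of the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention when no positives are predicted or present
    return 2 * precision * recall / (precision + recall)


# Values taken from the summary, converted from percentages to fractions.
reported_precision = 0.880
reported_recall = 0.930
reported_f1 = 0.900

computed_f1 = f1_score(reported_precision, reported_recall)
difference = abs(computed_f1 - reported_f1)

print(f"F1 from reported precision and recall: {computed_f1:.3f}")
print(f"Absolute difference from reported F1:  {difference:.3f}")
```

Under that assumption the computed value is about 0.904, within half a percentage point of the reported 90.0%. A gap of that size could come from rounding or from per-field averaging, which this summary does not specify.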