



LLM-IE: a python package for biomedical generative information extraction with large language models.

Enshuo Hsu (1, 2), Kirk Roberts (1)

  • (1) McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, United States.

JAMIA Open | March 13, 2025
Summary
This summary is machine-generated.

LLM-IE is a Python package that simplifies biomedical information extraction with large language models (LLMs). It offers tools for prompt engineering and for building extraction pipelines, and its sentence-based prompting algorithm achieved over 70% strict F1 for entity extraction on the i2b2 clinical datasets.

Keywords:
information extraction; large language models; named entity recognition; natural language processing; relation extraction


Area of Science:

  • Biomedical Informatics
  • Natural Language Processing

Background:

  • Large language models (LLMs) show promise for biomedical information extraction (IE).
  • Existing challenges in prompt engineering and algorithm development limit LLM application in IE.
  • There is a lack of dedicated software for creating comprehensive IE pipelines.

Purpose of the Study:

  • To develop a user-friendly Python package, LLM-IE, for constructing end-to-end biomedical information extraction pipelines.
  • To address the persistent challenges in prompt engineering and algorithm design for LLM-based IE.
  • To provide essential building blocks for robust and efficient IE system development.

Main Methods:

  • Developed LLM-IE, a Python package supporting named entity recognition, entity attribute extraction, and relation extraction.
  • Implemented an interactive LLM agent for schema definition and prompt design.
  • Utilized state-of-the-art prompting algorithms and visualization features.
  • Benchmarked LLM-IE performance on the i2b2 clinical datasets.
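
The idea behind sentence-based prompting can be sketched in a few lines of Python. This is an illustrative sketch only, not the LLM-IE API: the names `Entity`, `split_sentences`, `extract_entities`, and `toy_llm` are hypothetical, the sentence splitter is a naive regex, and the "LLM" is a stand-in function that returns mentions from a fixed vocabulary. The point it shows is that each sentence is prompted separately and the returned mentions are mapped back to character offsets in the full document.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Entity:
    """A mention with character offsets into the full document (hypothetical type)."""
    start: int
    end: int
    text: str


def split_sentences(text: str) -> List[Tuple[int, str]]:
    """Return (document_offset, sentence) pairs using a naive punctuation split."""
    spans = []
    for match in re.finditer(r"[^.!?]+[.!?]?", text):
        raw = match.group()
        sentence = raw.strip()
        if sentence:
            leading = len(raw) - len(raw.lstrip())
            spans.append((match.start() + leading, sentence))
    return spans


def extract_entities(text: str, llm: Callable[[str], List[str]]) -> List[Entity]:
    """Prompt the model once per sentence and align each mention to the document."""
    entities = []
    for offset, sentence in split_sentences(text):
        for mention in llm(sentence):
            position = sentence.find(mention)
            if position == -1:
                continue  # skip mentions that do not appear verbatim in the sentence
            start = offset + position
            entities.append(Entity(start, start + len(mention), mention))
    return entities


def toy_llm(sentence: str) -> List[str]:
    """Stand-in for a real LLM call (assumption): looks up a fixed vocabulary."""
    vocabulary = ["aspirin", "chest pain", "metformin"]
    return [term for term in vocabulary if term in sentence]


note = "Patient reports chest pain. Started aspirin today."
found = extract_entities(note, toy_llm)
for entity in found:
    print(entity.start, entity.end, entity.text)
```

In a real pipeline, `toy_llm` would be replaced by a call to a language model with a task-specific prompt. Working one sentence at a time keeps each prompt short and makes it straightforward to recover exact character spans.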

Main Results:

  • The sentence-based prompting algorithm achieved over 70% strict F1 for entity extraction in an 8-shot setting.
  • The system demonstrated approximately 60% F1 for entity attribute extraction.
  • LLM-IE successfully supports key IE tasks including NER, entity attribute extraction, and relation extraction.
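
For readers unfamiliar with the metric, the following sketch shows one way a strict F1 score can be computed. It assumes the usual convention that a predicted entity counts as correct only when its start offset, end offset, and entity type all match a gold annotation exactly; the summary above does not define the metric, so this is an assumption. The function name `strict_f1` and the example spans are hypothetical.

```python
from typing import Set, Tuple

# (start_offset, end_offset, entity_type); an assumed representation
Span = Tuple[int, int, str]


def strict_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """F1 where only exact boundary-and-type matches count as true positives."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold_spans = {(16, 26, "problem"), (36, 43, "treatment"), (50, 60, "test")}
predicted_spans = {(16, 26, "problem"), (36, 42, "treatment")}  # second span is off by one

score = strict_f1(gold_spans, predicted_spans)
print(round(score, 3))
```

Here the second prediction has the right type but ends one character early, so under strict matching it counts as both a false positive and a missed gold entity. Precision is 1/2, recall is 1/3, and the resulting F1 is 0.4.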

Conclusions:

  • LLM-IE provides a foundational toolkit for developing advanced biomedical information extraction pipelines.
  • The package facilitates schema definition and prompt design, and it provides effective prompting algorithms.
  • Future work will focus on expanding LLM-IE's capabilities and enhancing its computational efficiency.