A data-driven approach for extracting "the most specific term" for ontology development.

Guergana K Savova¹, Marcelline Harris, Thomas Johnson

¹Division of Medical Informatics Research, Mayo Clinic, Rochester, MN, USA.

AMIA ... Annual Symposium Proceedings. AMIA Symposium

January 20, 2004

Summary

This summary is machine-generated.

This study introduces a data-driven method for identifying the most specific terms for ontologies of functioning, disability, and health (FDH). The approach combines statistical content word frequency with a linguistic heuristic based on complete syntactic nodes, so that extracted terms are not limited to noun phrases.

Area of Science:

Medical Informatics
Natural Language Processing
Health Informatics

Background:

Ontologies of functioning, disability, and health (FDH) are crucial for standardizing health information.
Extracting the most specific and relevant terms from clinical text is challenging.
Existing methods often focus narrowly on noun phrases, limiting their scope.

Purpose of the Study:

To develop and evaluate a data-driven algorithm for extracting highly specific terms relevant to FDH ontologies.
To extend existing term extraction techniques beyond simple noun phrases.
To assess the algorithm's performance on diverse clinical text datasets.

Main Methods:

A hybrid approach combining statistical content word frequency with a linguistic heuristic.
The linguistic heuristic identifies 'complete syntactic nodes' for broader applicability.
The algorithm was tested on two datasets: pain abstracts and actual medical reports, annotated by experts.
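
The hybrid idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the algorithm's details, so the stopword list, the frequency threshold, the tuple-based parse tree format, and the reading of a "complete syntactic node" as the lowest phrase node that directly dominates a frequent content word are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "with"}


def content_word_counts(sentences):
    """Count lowercased non-stopword tokens across tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        counts.update(t.lower() for t in tokens if t.lower() not in STOPWORDS)
    return counts


def is_preterminal(node):
    """A preterminal is a (POS, word) pair, e.g. ("NN", "pain")."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node):
    """Return the words under a node; nodes are (label, child, ...) tuples."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def candidate_terms(tree, frequent):
    """Return the full text of each phrase node that directly dominates a
    frequent content word (the assumed 'complete syntactic node')."""
    found = []

    def visit(node):
        if is_preterminal(node):
            return
        children = node[1:]
        if any(is_preterminal(c) and c[1].lower() in frequent for c in children):
            found.append(" ".join(leaves(node)))
        for child in children:
            visit(child)

    visit(tree)
    return found


# Toy corpus and a hand-written parse (both invented for this example).
sentences = [
    ["The", "patient", "reports", "severe", "pain"],
    ["Pain", "in", "the", "lower", "back"],
]
counts = content_word_counts(sentences)
frequent = {w for w, c in counts.items() if c >= 2}  # hypothetical threshold

tree = ("S",
        ("NP", ("DT", "The"), ("NN", "patient")),
        ("VP", ("VBZ", "reports"),
         ("NP", ("JJ", "severe"), ("NN", "pain"))))
terms = candidate_terms(tree, frequent)
```

Because the selection works on any phrase label, a frequent verb would yield its enclosing verb phrase in the same way, which is one way to read the claim that the heuristic extends beyond noun phrases.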

Main Results:

The algorithm demonstrated effectiveness in extracting relevant terms from clinical text.
Performance was evaluated using recall, precision, and F-score metrics.
Analysis included the rate of valid terms within false positives.
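
The metrics named above can be computed as in the following sketch. The function name `evaluate`, the set-based matching of terms, and the interpretation of "valid terms within false positives" as extracted terms that are absent from the expert annotation but later judged acceptable are assumptions; the summary does not report the exact definitions or any numeric results, and the example values below are invented.

```python
def evaluate(extracted, gold, valid_unannotated=frozenset()):
    """Compute precision, recall, F-score, and the share of false positives
    that were nevertheless judged to be valid terms (assumed definition)."""
    extracted, gold = set(extracted), set(gold)
    true_pos = extracted & gold
    false_pos = extracted - gold

    precision = len(true_pos) / len(extracted) if extracted else 0.0
    recall = len(true_pos) / len(gold) if gold else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    valid_fp_rate = (
        len(false_pos & set(valid_unannotated)) / len(false_pos)
        if false_pos else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "valid_fp_rate": valid_fp_rate,
    }


# Invented example data, not taken from the study.
scores = evaluate(
    extracted={"severe pain", "lower back", "patient", "reports"},
    gold={"severe pain", "lower back", "chronic pain"},
    valid_unannotated={"patient"},
)
```

Reporting the valid-term rate alongside precision shows how much of the apparent error comes from terms the annotators did not mark rather than from genuinely wrong extractions.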

Conclusions:

The proposed data-driven method offers a robust way to extract specific terms for FDH ontologies.
The 'complete syntactic node' approach enhances flexibility across different syntactic structures.
Further validation with larger datasets is recommended to confirm generalizability.