Using distributional analysis to semantically classify UMLS concepts.

Jung-Wei Fan¹, Hua Xu, Carol Friedman

¹Department of Biomedical Informatics, Columbia University, USA. fan@dbmi.columbia.edu

Studies in Health Technology and Informatics

October 4, 2007

Summary

This summary is machine-generated.

This study developed an automated method for checking and refining the semantic classification of biomedical concepts in the Unified Medical Language System (UMLS). More accurate semantic categorization is intended to benefit the Natural Language Processing (NLP) and knowledge-based systems that rely on the UMLS.

Area of Science:

Biomedical Informatics
Natural Language Processing
Computational Linguistics

Background:

The Unified Medical Language System (UMLS) is a critical resource for biomedical Natural Language Processing (NLP).
Inaccurate semantic classification within the UMLS can negatively impact the performance of NLP and knowledge-based systems.
Automated validation and reclassification of UMLS concepts are needed to enhance data accuracy.

Purpose of the Study:

To develop and evaluate an automated method for semantically classifying UMLS concepts.
To distinguish between biologic functions and disorders within the T033 Finding class.
To improve the accuracy of UMLS semantic categorization for NLP applications.

Main Methods:

Applied a distributional similarity method utilizing syntactic dependencies and α-skew divergence.
Classified concepts within the UMLS T033 Finding semantic category.
Created a gold standard dataset through expert annotation of 100 randomly sampled concepts.
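
To make the similarity measure concrete, the following is a minimal sketch of α-skew divergence in its commonly cited form, s_α(q, r) = KL(r ‖ αq + (1 − α)r), used to rank candidate classes for a concept. It is not the authors' implementation. The function names, the feature strings, the toy counts, the class labels, the default α = 0.99, and the choice to treat the concept as r and the class profile as q are all assumptions made for illustration; the summary does not specify them.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw feature counts into a probability distribution."""
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def kl_divergence(p, q):
    """KL(p || q) over the support of p; q must be positive there."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


def skew_divergence(q, r, alpha=0.99):
    """s_alpha(q, r) = KL(r || alpha*q + (1 - alpha)*r).

    Mixing a little of r into q keeps the divergence finite even when
    q assigns zero probability to a feature that r has observed.
    """
    mix = {x: alpha * q.get(x, 0.0) + (1 - alpha) * rx for x, rx in r.items()}
    return kl_divergence(r, mix)


def rank_classes(concept_counts, class_counts, k=1, alpha=0.99):
    """Return the k class labels with the lowest divergence from the concept."""
    r = normalize(concept_counts)
    scored = [
        (label, skew_divergence(normalize(counts), r, alpha))
        for label, counts in class_counts.items()
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical syntactic-dependency features and counts (illustration only).
concept = Counter({"subj_of:increase": 4, "obj_of:treat": 1})
profiles = {
    "biologic_function": Counter({"subj_of:increase": 5, "subj_of:occur": 2}),
    "disorder": Counter(
        {"obj_of:treat": 6, "obj_of:diagnose": 3, "subj_of:increase": 1}
    ),
}

for label, score in rank_classes(concept, profiles, k=2):
    print(f"{label}: {score:.3f}")
```

Calling `rank_classes` with `k=1` or `k=2` mirrors the idea of evaluating the top prediction versus the top 2 predictions; how the study itself scored those predictions against the gold standard is not described in this summary.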

Main Results:

The top prediction achieved a precision of 0.54 and recall of 0.654.
Incorporating the top 2 predictions improved performance to a precision of 0.64 and recall of 0.769.
Error analysis identified limitations and areas for future methodological refinement.

Conclusions:

The developed distributional similarity method shows promise for automated UMLS concept reclassification.
Further improvements are necessary to address identified errors and enhance classification accuracy.
Accurate semantic classification is crucial for advancing biomedical NLP and knowledge systems.