Automating the extraction of otology symptoms from clinic letters: a methodological study using natural language processing
Summary
This summary is machine-generated. This study developed natural language processing (NLP) models to automatically extract and contextualize otology symptoms from clinical notes. These models show promise for improving healthcare research and electronic health record analysis.
Area Of Science
- Medical Informatics
- Computational Linguistics
- Otolaryngology Research
Background
- Healthcare data is predominantly unstructured, hindering research scalability.
- Manual data processing is time-consuming and inefficient.
- Natural Language Processing (NLP) offers automated solutions for data extraction.
Purpose Of The Study
- To develop NLP models for extracting and contextualizing otology symptoms from free-text clinical documents.
- To automate the processing of unstructured otology-related healthcare data.
Main Methods
- A hybrid dictionary and machine learning NLP model was trained on 1,148 otology clinic letters.
- Bidirectional long short-term memory (Bi-LSTM) models were used for symptom contextualization.
- Six key otological symptoms were targeted: hearing loss, balance impairment, otalgia, otorrhoea, tinnitus, and vertigo.
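The dictionary half of a hybrid extractor can be illustrated with a simple term matcher. This is a minimal sketch, not the study's implementation: the synonym lists in `SYMPTOM_LEXICON` and the function names `build_pattern` and `extract_symptoms` are hypothetical, chosen only to show how surface terms might be mapped to the six target symptoms.

```python
import re

# Hypothetical lexicon (an assumption for illustration, not the study's dictionary):
# each canonical symptom maps to a few surface forms that might appear in a letter.
SYMPTOM_LEXICON = {
    "hearing loss": ["hearing loss", "hearing impairment", "reduced hearing"],
    "balance impairment": ["balance impairment", "imbalance", "unsteadiness"],
    "otalgia": ["otalgia", "ear pain", "earache"],
    "otorrhoea": ["otorrhoea", "otorrhea", "ear discharge"],
    "tinnitus": ["tinnitus"],
    "vertigo": ["vertigo"],
}


def build_pattern(lexicon):
    """Compile one case-insensitive regex covering every surface term."""
    term_to_symptom = {
        term: symptom for symptom, terms in lexicon.items() for term in terms
    }
    # Longest terms first, so multi-word terms win over shorter overlapping ones.
    ordered = sorted(term_to_symptom, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return pattern, term_to_symptom


def extract_symptoms(text, lexicon=SYMPTOM_LEXICON):
    """Return (start, end, canonical_symptom) for every dictionary hit in text."""
    pattern, term_to_symptom = build_pattern(lexicon)
    return [
        (m.start(), m.end(), term_to_symptom[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    letter = "She reports tinnitus and ear pain but denies vertigo."
    for start, end, symptom in extract_symptoms(letter):
        print(start, end, symptom)
```

In the example sentence the matcher also flags "vertigo" even though the patient denies it. A pure dictionary lookup cannot tell the difference, which is presumably the kind of gap a separate contextualization step is meant to close.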
Main Results
- The symptom extraction model achieved a macro F1 score of 0.73.
- Bi-LSTM models achieved a mean macro F1 score of 0.69 for contextualization.
- 24% of patients presented with hearing loss; the data contained 1,197 symptom annotations and 2,861 contextual annotations.
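For readers unfamiliar with the metric reported above, macro F1 is the unweighted mean of the per-class F1 scores, so rare symptoms count as much as common ones. The following is a minimal sketch of that calculation; the function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Toy labels for illustration only.
    gold = ["tinnitus", "tinnitus", "vertigo", "otalgia"]
    pred = ["tinnitus", "vertigo", "vertigo", "otalgia"]
    print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally, a model that does well on frequent symptoms but poorly on rare ones will score lower on macro F1 than on accuracy.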
Conclusions
- Successfully developed NLP models for otology symptom extraction and contextualization.
- The models performed well on real-world clinical data (macro F1 of 0.73 for extraction and a mean macro F1 of 0.69 for contextualization).
- Potential applications include semantic searching, clinical trial cohort identification, and hearing loss phenotype research.

