Evaluation of preprocessing techniques for chief complaint classification.

Jagan Dara¹, John N Dowling, Debbie Travers

¹Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, VALE M-183, Pittsburgh, PA 15260, USA.

Journal of Biomedical Informatics

January 2, 2008

Summary

This summary is machine-generated.


Preprocessing chief complaints by splitting them into multiple problems significantly improved syndromic classification performance for the CoCo classifier. Other preprocessing methods offered minimal gains for syndromic classification accuracy.

Area of Science:

Public Health
Health Informatics
Computational Linguistics

Background:

Automated classification of chief complaints is crucial for public health surveillance.
Preprocessing steps can potentially enhance the accuracy of these classifications.
The impact of different preprocessing techniques on syndromic classification performance requires further investigation.

Purpose of the Study:

To evaluate the impact of preprocessing chief complaints on automated syndromic classification performance.
To compare the effectiveness of two preprocessing methods (CCP and EMT-P) with different classifiers (CoCo and KC).

Main Methods:

Chief complaints were preprocessed using two methods: CCP and EMT-P.
Classification performance was assessed using a probabilistic classifier (CoCo) and a keyword-based classifier (KC).
Improvements in classification accuracy and sensitivity were evaluated for various syndromes.
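
The sensitivity comparison described above can be illustrated with a minimal sketch. It assumes each visit has a set of reference (gold) syndromes and a set of predicted syndromes, and computes per-syndrome sensitivity as true positives divided by true positives plus false negatives. The function name and data layout are hypothetical and are not taken from the study.

```python
def sensitivity_by_syndrome(gold, predicted):
    """Per-syndrome sensitivity, TP / (TP + FN).

    gold, predicted: lists of sets of syndrome labels, one set per visit
    (an assumed layout, not the study's data format).
    """
    true_pos = {}
    false_neg = {}
    for gold_labels, predicted_labels in zip(gold, predicted):
        for syndrome in gold_labels:
            if syndrome in predicted_labels:
                true_pos[syndrome] = true_pos.get(syndrome, 0) + 1
            else:
                false_neg[syndrome] = false_neg.get(syndrome, 0) + 1
    syndromes = set(true_pos) | set(false_neg)
    return {
        s: true_pos.get(s, 0) / (true_pos.get(s, 0) + false_neg.get(s, 0))
        for s in syndromes
    }


# Made-up example data: three visits.
gold = [{"Respiratory"}, {"Respiratory", "Constitutional"}, {"Constitutional"}]
predicted = [{"Respiratory"}, {"Constitutional"}, {"Constitutional"}]
result = sensitivity_by_syndrome(gold, predicted)
```

Running the same function on classifier output with and without a preprocessing step gives one way to compare the step's downstream effect syndrome by syndrome.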

Main Results:

CCP preprocessing was 85% accurate but yielded only minor improvements for CoCo.
EMT-P preprocessing, which segments complaints into multiple issues, substantially boosted CoCo's sensitivity across all syndromes.
Both CCP and EMT-P minimally improved KC's sensitivity, primarily for the Constitutional syndrome.
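
The effect of segmenting a complaint into multiple problems can be sketched with a toy example. The code below is not EMT-P or CoCo: the delimiter pattern, the keyword lists, and the single-label classifier are all assumptions chosen for illustration. It shows how a classifier that returns one syndrome per input can miss syndromes in a multi-problem complaint unless the complaint is split first.

```python
import re

# Hypothetical keyword lists for illustration only; these are not the
# vocabularies used by CoCo, KC, CCP, or EMT-P.
SYNDROME_KEYWORDS = {
    "Respiratory": {"cough", "wheezing"},
    "Gastrointestinal": {"vomiting", "diarrhea"},
    "Constitutional": {"fever", "chills"},
}

# Assumed delimiters between problems in a free-text chief complaint.
PROBLEM_DELIMITER = re.compile(r"\s*(?:/|,|;|&|\band\b)\s*")


def split_complaint(text):
    """Split one chief complaint into candidate problems."""
    parts = PROBLEM_DELIMITER.split(text.strip().lower())
    return [part for part in parts if part]


def classify_single(text):
    """Toy single-label classifier: return the syndrome with the most
    keyword hits, or None when nothing matches."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(tokens & words) for name, words in SYNDROME_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def classify_with_splitting(text):
    """Classify each problem separately and collect the distinct syndromes."""
    labels = {classify_single(problem) for problem in split_complaint(text)}
    labels.discard(None)
    return labels
```

For the complaint "Fever/cough and vomiting", `classify_single` returns only one syndrome, while `classify_with_splitting` returns all three, which is the kind of sensitivity gain a single-label classifier could see from segmentation.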

Conclusions:

Preprocessing effectiveness for syndromic classification should consider its impact on downstream classification tasks, not just preprocessor accuracy.
Segmenting chief complaints into multiple problems is vital for enhancing probabilistic classifier performance.
The benefits of other preprocessing steps on classification performance were limited for both probabilistic and keyword-based systems.