Using compound codes for automatic classification of clinical diagnoses.

Serguei V Pakhomov¹, James D Buntrock, Christopher G Chute

  • ¹Division of Medical Informatics Research, Department of Health Sciences Research, Mayo Clinic, 200 SW First Street, Rochester, MN 55905, USA. Pakhomov.serguei@mayo.edu

Studies in Health Technology and Informatics
September 14, 2004
PubMed
Summary
This summary is machine-generated.

This study introduces a method for efficiently classifying medical diagnoses using a Naïve Bayes classifier. Using compound categories for statements with multiple diagnoses improved classification accuracy by about 3%, suggesting a promising avenue for automated medical coding.

Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Machine Learning

Background:

  • Medical diagnosis classification is crucial for information retrieval but is labor-intensive and error-prone due to large code sets.
  • Existing systems struggle with diagnostic statements containing multiple codes, leading to multi-class classification challenges.

Purpose of the Study:

  • To develop a streamlined methodology for cleaning and reusing manually coded diagnostic statements.
  • To build predictive models for automated diagnosis classification using a Naïve Bayes classifier.
  • To address multi-class classification problems by investigating the use of compound (multiple code) categories.
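To make the idea of a compound category concrete, here is a minimal Python sketch. It is an illustration only, not the authors' code: the category names, the `+` separator, and the helper functions `compound_label`, `simple_examples`, and `compound_example` are all assumptions introduced for this example.

```python
from typing import Iterable, List, Tuple


def compound_label(codes: Iterable[str]) -> str:
    """Join all codes on one statement into a single, order-independent label.

    The "+" separator is an arbitrary choice made for this sketch.
    """
    return "+".join(sorted(set(codes)))


def simple_examples(text: str, codes: Iterable[str]) -> List[Tuple[str, str]]:
    """Simple-category view: one training example per code on the statement."""
    return [(text, code) for code in sorted(set(codes))]


def compound_example(text: str, codes: Iterable[str]) -> Tuple[str, str]:
    """Compound-category view: one training example with a combined label."""
    return (text, compound_label(codes))


# Hypothetical diagnostic statement carrying two top-level categories.
statement = "diabetes mellitus with hypertension"
codes = ["ENDOCRINE", "CIRCULATORY"]

simple = simple_examples(statement, codes)
compound = compound_example(statement, codes)
print(simple)
print(compound)
```

In the simple view the same text appears under two competing labels, while the compound view treats the combination as a class of its own, at the cost of a larger label set.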

Main Methods:

  • A sparse-feature implementation of the Naïve Bayes classifier was employed.
  • Manually coded diagnostic statements from clinical notes were cleaned and reused.
  • Diagnostic strings were classified into top-level categories, including an investigation into compound categories.

Main Results:

  • Experimental classification of over 16,000 diagnostic strings into 19 top-level categories was performed.
  • A 3% improvement in classification accuracy was observed when using compound categories compared to simple categories.
  • The results indicate that compound categories offer a promising approach for multi-code diagnostic statements.

Conclusions:

  • The proposed methodology offers a simpler way to build predictive models for diagnosis classification.
  • The use of compound categories shows potential for improving the accuracy of automated medical coding systems.
  • Further research and refinement are needed to fully optimize the use of compound categories in medical classification.
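
The sparse-feature Naïve Bayes classifier named under Main Methods might look roughly like the Python sketch below. This is a hedged illustration, not the study's implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the `SparseNaiveBayes` class, and the toy training data are all assumptions. "Sparse" here means that only the tokens actually seen with each class are stored, in dictionaries, rather than a dense vocabulary-sized vector per class.

```python
import math
from collections import Counter, defaultdict


class SparseNaiveBayes:
    """Multinomial Naive Bayes that stores only observed (class, token) counts."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha                        # smoothing constant (assumed)
        self.class_counts = Counter()             # training statements per class
        self.token_counts = defaultdict(Counter)  # class -> token -> count
        self.total_tokens = Counter()             # total tokens seen per class
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.total_tokens[label] += len(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n / n_docs)
            denom = self.total_tokens[label] + self.alpha * vocab_size
            for tok in tokens:
                # A Counter returns 0 for unseen tokens without storing them.
                count = self.token_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for illustration; a compound category is written here
# as two category names joined by "+".
train = [
    ("type 2 diabetes mellitus", "ENDOCRINE"),
    ("essential hypertension", "CIRCULATORY"),
    ("diabetes mellitus with hypertension", "CIRCULATORY+ENDOCRINE"),
    ("diabetes with hypertension follow up", "CIRCULATORY+ENDOCRINE"),
]
model = SparseNaiveBayes().fit(train)
prediction = model.predict("hypertension with diabetes")
print(prediction)
```

Because a compound category is simply one more class label, the classifier itself needs no changes to handle multi-code statements; only the labeling of the training data differs.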