Medical text representations for inductive learning.

A Wilcox¹, G Hripcsak

  • ¹Department of Medical Informatics, Columbia University, New York, NY, USA.

Proceedings. AMIA Symposium | November 18, 2000
Summary
This summary is machine-generated.

Choosing the right data representation is crucial for inductive learning algorithms classifying medical text. Explicitly capturing status information significantly improves classification performance, highlighting the impact of subtle data representation differences.

Area of Science:

  • Medical Informatics
  • Machine Learning
  • Natural Language Processing

Background:

  • Inductive learning algorithms are increasingly used for medical text classification.
  • Various text representation techniques exist, with subtle differences potentially impacting performance.

Purpose of the Study:

  • To evaluate the impact of different data representation techniques on medical text classification performance.
  • To identify which data representations are most effective for standard machine learning algorithms.

Main Methods:

  • Examined 8 distinct data representation techniques for medical text.
  • Evaluated these representations using standard machine learning algorithms.
  • Quantified the loss of classification-relevant information for each representation.
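
The summary does not list the eight representations, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "status" includes negation, and it contrasts a plain bag-of-words with a representation that marks words following a negation cue. The cue list, the `neg_` prefix, the function names, and both sample reports are hypothetical.

```python
import re
from collections import Counter

# Hypothetical negation cues; a real system would rely on a clinical NLP parser.
NEGATION_CUES = {"no", "without", "denies"}


def bag_of_words(text):
    """Plain word counts: a negation cue is just one more word."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def status_tagged(text):
    """Word counts in which every word after a negation cue, up to the end
    of the sentence, carries a 'neg_' prefix (an assumed, simplified rule)."""
    features = Counter()
    for sentence in re.split(r"[.;!?]", text.lower()):
        negated = False
        for token in re.findall(r"[a-z]+", sentence):
            if token in NEGATION_CUES:
                negated = True  # the cue itself is not kept as a feature
                continue
            features[("neg_" if negated else "") + token] += 1
    return features


# Two made-up report fragments with opposite meanings.
report_a = "Pneumonia in the right lower lobe."
report_b = "No pneumonia. Lungs are clear."

for name, report in (("A", report_a), ("B", report_b)):
    print(name, "plain: ", dict(bag_of_words(report)))
    print(name, "status:", dict(status_tagged(report)))
```

In the plain representation both reports contain the feature `pneumonia`, so a learner has to infer the negation from the separate token `no`. In the status-tagged representation the second report contains `neg_pneumonia` instead, which makes the distinction explicit in a single feature.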

Main Results:

  • Representations explicitly capturing status information yielded significantly better classification performance.
  • Algorithm performance demonstrated sensitivity to subtle variations in data representation.
  • Significant differences in performance were observed across the 8 evaluated techniques.

Conclusions:

  • Data representation is a critical factor influencing the success of inductive learning in medical text classification.
  • Explicitly representing status information is a key strategy for improving classification accuracy.
  • Further research into optimal data representation for medical text is warranted.