Exploring supervised and unsupervised methods to detect topics in biomedical text.

Minsuk Lee¹, Weiqing Wang, Hong Yu

  • ¹Department of Biomedical Informatics, Columbia University, 622 West 168th Street, VC-5, NY 10032, USA. ml1065@columbia.edu

BMC Bioinformatics | March 17, 2006
Summary
This summary is machine-generated.

We explored topic detection in scientific literature using machine learning. Supervised methods such as Naïve Bayes were more accurate than unsupervised clustering at identifying biological topics.

Area of Science:

  • Computational Biology
  • Bioinformatics
  • Natural Language Processing

Background:

  • Topic detection identifies the scientific subjects of articles, a step that is crucial for biological information retrieval and literature analysis.
  • Efficiently searching vast biological literature requires robust topic detection methods.

Purpose of the Study:

  • To evaluate supervised (Topic Spotting) and unsupervised (Topic Clustering) methods for scientific topic detection.
  • To enhance topic detection performance by incorporating semantic features like MeSH and UMLS.

Main Methods:

  • Applied Naïve Bayes for supervised Topic Spotting and hierarchical clustering for unsupervised Topic Clustering.
  • Utilized bag-of-words, Medical Subject Headings (MeSH), and Unified Medical Language System (UMLS) semantic types as features.
  • Tested methods on over 15,000 articles linked to the Online Mendelian Inheritance in Man (OMIM) database.
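
The supervised Topic Spotting setup described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial Naïve Bayes with add-one smoothing, whitespace tokenization, and a hypothetical `MESH=` prefix that keeps MeSH terms apart from ordinary words. The four training documents, their MeSH terms, and the two topic labels are invented for illustration and are not taken from the OMIM data.

```python
import math
from collections import Counter, defaultdict


def featurize(text, mesh_terms=()):
    """Bag-of-words tokens plus MeSH terms (prefixed so they cannot collide with words)."""
    tokens = text.lower().split()
    return tokens + ["MESH=" + term.lower() for term in mesh_terms]


class NaiveBayesTopicSpotter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing; an assumed variant."""

    def fit(self, docs, topics):
        self.topic_counts = Counter(topics)          # documents per topic
        self.feature_counts = defaultdict(Counter)   # feature counts per topic
        self.vocab = set()
        for features, topic in zip(docs, topics):
            self.feature_counts[topic].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total_docs = sum(self.topic_counts.values())
        vocab_size = len(self.vocab)
        best_topic, best_score = None, -math.inf
        for topic, n_docs in self.topic_counts.items():
            counts = self.feature_counts[topic]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n_docs / total_docs)    # log prior
            for feature in features:
                if feature in self.vocab:            # skip features never seen in training
                    score += math.log((counts[feature] + 1) / denom)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


# Toy training set (invented): (text, MeSH terms, topic label).
training = [
    ("cftr mutation impairs chloride transport in lung epithelium",
     ["Cystic Fibrosis Transmembrane Conductance Regulator"], "cystic fibrosis"),
    ("chloride channel dysfunction causes thick airway mucus",
     ["Chloride Channels"], "cystic fibrosis"),
    ("expanded cag repeat in huntingtin causes striatal neuron loss",
     ["Huntingtin Protein"], "huntington disease"),
    ("polyglutamine aggregates drive neuron degeneration in striatum",
     ["Trinucleotide Repeat Expansion"], "huntington disease"),
]

spotter = NaiveBayesTopicSpotter().fit(
    [featurize(text, mesh) for text, mesh, _ in training],
    [topic for _, _, topic in training],
)

query = featurize("chloride transport defect in airway epithelium", ["Chloride Channels"])
print(spotter.predict(query))
```

Dropping the `mesh_terms` argument from `featurize` gives the bag-of-words-only baseline, so the same class can be used to compare the two feature sets on a held-out collection.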

Main Results:

  • Supervised Topic Spotting using Naïve Bayes achieved the highest accuracy (66.4%) in predicting 25 OMIM topics.
  • Incorporating MeSH terms and UMLS semantic types as features significantly improved topic detection performance.
  • Bag-of-words combined with semantic features outperformed bag-of-words alone.

Conclusions:

  • Supervised topic spotting methods demonstrated superior performance over unsupervised topic clustering for this task.
  • Unsupervised topic clustering offers robustness and real-world applicability despite lower accuracy in this study.
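
For comparison, the unsupervised Topic Clustering approach can be sketched as agglomerative hierarchical clustering. The summary does not state which linkage or similarity measure was used, so average linkage over cosine similarity of bag-of-words counts is an assumption here, and the four short documents are invented for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def average_link(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters."""
    sims = [cosine(vectors[i], vectors[j]) for i in c1 for j in c2]
    return sum(sims) / len(sims)


def cluster(vectors, n_clusters):
    """Merge the two most similar clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_link(clusters[x], clusters[y], vectors)
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy documents (invented); no topic labels are used during clustering.
texts = [
    "cftr chloride transport lung epithelium",
    "chloride channel airway mucus epithelium",
    "huntingtin cag repeat striatal neuron",
    "polyglutamine repeat neuron degeneration striatum",
]
vectors = [Counter(text.split()) for text in texts]
clusters = cluster(vectors, n_clusters=2)
print(clusters)
```

Because no labels are needed, a sketch like this can run on any collection, which is one way to read the robustness noted in the conclusions. Scoring it against known topics would require an extra step that maps each cluster to a label.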