Enhancing text categorization with semantic-enriched representation and training data augmentation.

Xinghua Lu1, Bin Zheng, Atulya Velivelli

  • 1Department of Biostatistics, Bioinformatics and Epidemiology, Charleston, SC 29425, USA. lux@musc.edu

Journal of the American Medical Informatics Association : JAMIA | June 27, 2006
Summary
This summary is machine-generated.

Biomedical text categorization can be improved by representing documents with semantic-enriched features and by using semi-supervised learning to augment the training data. Together, these methods improve the efficiency and performance of support vector machine (SVM) classifiers.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Information Retrieval

Background:

  • Biomedical knowledge acquisition requires efficient document retrieval from vast literature.
  • High dimensionality and limited annotated data hinder information retrieval algorithm performance.
  • Developing improved text categorization methods is crucial for bioinformatics.

Purpose of the Study:

  • To enhance text categorization performance in high-dimensional, sparse biomedical data.
  • To investigate semantic-preserving dimension reduction and semi-supervised learning for data augmentation.
  • To evaluate the impact of these methods on support vector machine (SVM) classification.

Main Methods:

  • Applied a probabilistic topic model for semantic topic extraction and dimension reduction.

  • Represented documents in a reduced-dimensionality semantic topic space.
  • Utilized a graph-based semi-supervised learning algorithm to augment training data with pseudo-positive cases.
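As a rough illustration of the topic-space representation described above, the following minimal sketch maps a bag of words onto topic proportions. It assumes the topic-word distributions are already known; the `TOPICS` table, its probabilities, and the `topic_proportions` helper are hypothetical and are not taken from the study, which may have used a different model and inference procedure.

```python
from typing import Dict, List

# Hypothetical, hand-written topic-word distributions P(word | topic).
# In practice these would come from a fitted probabilistic topic model.
TOPICS: List[Dict[str, float]] = [
    {"gene": 0.4, "protein": 0.4, "patient": 0.1, "therapy": 0.1},  # molecular
    {"gene": 0.1, "protein": 0.1, "patient": 0.4, "therapy": 0.4},  # clinical
]


def topic_proportions(tokens: List[str],
                      topics: List[Dict[str, float]],
                      iterations: int = 50) -> List[float]:
    """Estimate P(topic | document) by EM, keeping the topics fixed."""
    k = len(topics)
    theta = [1.0 / k] * k  # start from a uniform topic mixture
    # Ignore tokens that no topic can explain.
    known = [w for w in tokens if any(t.get(w, 0.0) > 0.0 for t in topics)]
    if not known:
        return theta
    for _ in range(iterations):
        counts = [0.0] * k
        for w in known:
            # E-step: share of this token attributed to each topic.
            weights = [theta[z] * topics[z].get(w, 0.0) for z in range(k)]
            total = sum(weights)
            if total == 0.0:
                continue
            for z in range(k):
                counts[z] += weights[z] / total
        mass = sum(counts)
        if mass == 0.0:
            break
        # M-step: renormalize the expected counts into proportions.
        theta = [c / mass for c in counts]
    return theta


# A document over a 4-word vocabulary becomes a 2-dimensional topic vector.
doc = ["gene", "protein", "gene"]
print(topic_proportions(doc, TOPICS))
```

The resulting vector has one entry per topic rather than one per vocabulary word, which is the sense in which the representation reduces dimensionality.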
Main Results:

  • Semantic-enriched data transformation reduced dimensionality while preserving key information.
  • Semi-supervised learning effectively augmented training datasets.
  • Both techniques significantly improved the efficiency and performance of SVM text categorization.

Conclusions:

  • Semantic-enriched data transformation is effective for high-dimensional biomedical text.
  • Semi-supervised learning-based data augmentation enhances classifier training.
  • Combined approaches improve biomedical literature analysis and knowledge discovery.
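The graph-based augmentation step can be sketched in the same spirit. The example below is a generic label-propagation routine over a cosine-similarity graph, not the specific algorithm used in the study, which this summary does not describe. The document vectors, the seed labels, and the 0.2 threshold are illustrative assumptions.

```python
import math
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def propagate_labels(vectors: List[List[float]],
                     labels: Dict[int, float],
                     iterations: int = 100) -> List[float]:
    """Spread +1/-1 seed labels over a fully connected similarity graph."""
    n = len(vectors)
    weights = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    scores = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in labels:
                updated.append(labels[i])  # labeled documents stay clamped
                continue
            total = sum(weights[i])
            # Unlabeled score = similarity-weighted mean of neighbor scores.
            updated.append(
                sum(w * s for w, s in zip(weights[i], scores)) / total
                if total else 0.0)
        scores = updated
    return scores


def pseudo_positives(scores: List[float],
                     labels: Dict[int, float],
                     threshold: float) -> List[int]:
    """Indices of unlabeled documents whose score reaches the threshold."""
    return [i for i, s in enumerate(scores)
            if i not in labels and s >= threshold]


# Hypothetical 2-topic document vectors; documents 0 and 1 are labeled.
vectors = [[0.9, 0.1], [0.1, 0.9], [0.85, 0.15], [0.2, 0.8], [0.5, 0.5]]
labels = {0: 1.0, 1: -1.0}
scores = propagate_labels(vectors, labels)
print(pseudo_positives(scores, labels, threshold=0.2))
```

Documents selected this way could then be added to the positive training set before fitting a classifier such as an SVM. The threshold trades the number of added cases against the risk of adding mislabeled ones.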