

Optimizing feature representation for automated systematic review work prioritization.

Aaron M Cohen1

  • 1Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA.

AMIA ... Annual Symposium Proceedings. AMIA Symposium | November 13, 2008
PubMed
Summary
This summary is machine-generated.


Automated document classification aids systematic reviews (SRs) by prioritizing documents. Combining n-gram and MeSH features proved most effective, outperforming NLP methods, with topic-specific data enhancing performance.

Area of Science:

  • Medical Informatics
  • Evidence-Based Medicine
  • Information Retrieval

Background:

  • Systematic reviews (SRs) are crucial for evidence-based medicine but are labor-intensive.
  • Automated document classification can improve the efficiency of SRs by prioritizing documents.
  • Prioritization ensures that the most relevant documents are processed first, saving time and resources.

Purpose of the Study:

  • To evaluate different feature systems for automated document classification in the context of SRs.
  • To compare the effectiveness of unigram, n-gram, MeSH, and NLP features.
  • To assess the impact of topic-specific training data versus general SR inclusion data on classification performance.

Main Methods:

  • Evaluated multiple classification feature systems: unigram, n-gram, Medical Subject Headings (MeSH), and natural language processing (NLP).

  • Tested feature systems on 15 systematic review tasks.
  • Used the area under the receiver operating characteristic curve (AUC) as the primary performance metric.
  • Compared the performance using topic-specific training data against general SR inclusion data.
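
As a rough illustration of the performance metric named above, AUC can be read as the probability that a randomly chosen included article is scored higher than a randomly chosen excluded one. The sketch below is a minimal pairwise computation, not the study's evaluation code; the function name `auc` and the example scores and labels are made up for illustration.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (included, excluded) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one included and one excluded document")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # included article correctly ranked above excluded one
            elif p == n:
                wins += 0.5   # a tie is counted as half a correct ordering
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six candidate articles (1 = included in the review).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 1, 0]
print(round(auc(scores, labels), 3))  # 6 of 9 pairs are ordered correctly
```

An AUC of 1.0 would mean every included article is ranked ahead of every excluded one, while 0.5 corresponds to a random ordering. This is why the metric suits a prioritization task, where the order of documents matters more than any single decision threshold.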
Main Results:

  • The optimal feature set combined n-gram and MeSH features.
  • Natural language processing (NLP)-based features did not enhance classification performance.
  • Topic-specific training data generally resulted in a significant performance improvement compared to general SR training data.

Conclusions:

  • A hybrid approach combining n-gram and MeSH features is most effective for document prioritization in SRs.
  • NLP features do not currently offer advantages for this specific task.
  • Utilizing topic-specific training data is crucial for maximizing the performance of automated document classification in SR workflows.
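
The hybrid representation described above could be sketched as follows. This summary does not give the tokenization, maximum n-gram length, or feature weighting used in the study, so the whitespace tokenizer, the bigram limit, the binary presence values, and all function names here are assumptions chosen only to show how text n-grams and MeSH headings can share one feature space.

```python
from typing import Dict, List


def ngram_features(text: str, max_n: int = 2) -> List[str]:
    """Word n-grams up to max_n from lowercased, whitespace-split text (assumed scheme)."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append("ngram=" + "_".join(tokens[i:i + n]))
    return feats


def mesh_features(mesh_terms: List[str]) -> List[str]:
    """One feature per MeSH heading assigned to the article."""
    return ["mesh=" + term.lower() for term in mesh_terms]


def combined_features(text: str, mesh_terms: List[str]) -> Dict[str, int]:
    """Binary presence vector over both feature families; prefixes keep them distinct."""
    return {f: 1 for f in ngram_features(text) + mesh_features(mesh_terms)}


# Hypothetical article: title text plus two MeSH headings.
feats = combined_features("ACE inhibitors lower blood pressure", ["Humans", "Hypertension"])
print(len(feats))  # 5 unigrams + 4 bigrams + 2 MeSH headings = 11
```

Prefixing each feature with its family (`ngram=` or `mesh=`) stops a MeSH heading from colliding with an identical word in the text, so a downstream classifier can weight the two sources independently.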