Optimizing feature representation for automated systematic review work prioritization.

Aaron M Cohen¹

¹Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA.

AMIA ... Annual Symposium Proceedings. AMIA Symposium

November 13, 2008

Summary

This summary is machine-generated.

Automated document classification supports systematic reviews (SRs) by prioritizing documents for review. Combining n-gram and MeSH features was the most effective feature set, NLP-based features added no benefit, and topic-specific training data improved performance over general SR inclusion data.

Area of Science:

Medical Informatics
Evidence-Based Medicine
Information Retrieval

Background:

Systematic reviews (SRs) are crucial for evidence-based medicine but are labor-intensive.
Automated document classification can improve the efficiency of SRs by prioritizing documents.
Prioritization ensures that the most relevant documents are processed first, saving time and resources.

Purpose of the Study:

To evaluate different feature systems for automated document classification in the context of SRs.
To compare the effectiveness of unigram, n-gram, MeSH, and NLP features.
To assess the impact of topic-specific training data versus general SR inclusion data on classification performance.

Main Methods:

Evaluated multiple classification feature systems: unigram, n-gram, Medical Subject Headings (MeSH), and natural language processing (NLP).

Tested feature systems on 15 systematic review tasks.

Used the area under the receiver operating characteristic curve (AUC) as the primary performance metric.

Compared the performance using topic-specific training data against general SR inclusion data.
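As an illustration of the AUC metric named above, the following minimal sketch computes it directly from its pairwise definition: the probability that a randomly chosen included document receives a higher score than a randomly chosen excluded one, with ties counted as half. The function name `auc` and the 0/1 label encoding are assumptions made for this example; the summary does not describe how the study computed the metric.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (included, excluded) document pairs in which
    the included document has the higher score. Ties count as half a win.

    labels: 1 for an included document, 0 for an excluded one (assumed encoding).
    scores: classifier scores, where higher means "more likely to be included".
    """
    included = [s for y, s in zip(labels, scores) if y == 1]
    excluded = [s for y, s in zip(labels, scores) if y == 0]
    if not included or not excluded:
        raise ValueError("AUC needs at least one included and one excluded document")

    wins = 0.0
    for p in included:
        for n in excluded:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(included) * len(excluded))


# Two included and two excluded documents; one included document is
# outranked by one excluded document, so 3 of the 4 pairs are ordered correctly.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))  # 0.75
```

An AUC of 1.0 means every included document is ranked above every excluded one, and 0.5 corresponds to a random ordering. For prioritization, this measures how well a scoring puts relevant documents near the top of the reading list.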

Main Results:

The optimal feature set combined n-gram and MeSH features.
Natural language processing (NLP)-based features did not enhance classification performance.
Topic-specific training data generally resulted in a significant performance improvement compared to general SR training data.
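One way to picture a combined n-gram and MeSH feature set is sketched below. It is a hypothetical illustration, not the study's implementation: the whitespace tokenization, the maximum n-gram length of 2, the `NGRAM=` and `MESH=` prefixes, and the function names are all assumptions made for this example.

```python
def ngram_features(text, max_n=2):
    """Return word n-grams (n = 1..max_n) of the text as a set of binary features.
    Tokenization is a simple lowercase whitespace split (an assumption)."""
    tokens = text.lower().split()
    features = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.add("NGRAM=" + " ".join(tokens[i:i + n]))
    return features


def combined_features(text, mesh_terms, max_n=2):
    """Union of text n-gram features and MeSH heading features.
    Prefixes keep the two feature families from colliding."""
    features = ngram_features(text, max_n)
    features.update("MESH=" + term for term in mesh_terms)
    return features


# Hypothetical citation: a short title plus one assigned MeSH heading.
feats = combined_features("Statin therapy trial", ["Randomized Controlled Trial"])
print(sorted(feats))
```

Keeping the MeSH headings as whole, prefixed features preserves each indexing term as a single unit alongside the free-text n-grams, so a classifier can weight the two sources of evidence separately.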

Conclusions:

A hybrid approach combining n-gram and MeSH features is most effective for document prioritization in SRs.
NLP features do not currently offer advantages for this specific task.
Utilizing topic-specific training data is crucial for maximizing the performance of automated document classification in SR workflows.