Automating document classification for the Immune Epitope Database.

Peng Wang¹, Alexander A Morgan, Qing Zhang

  • ¹ The La Jolla Institute for Allergy and Immunology, La Jolla, CA 92037, USA. pwang@liai.org

BMC Bioinformatics | July 28, 2007
Summary
This summary is machine-generated.

Naïve Bayes classifiers can automate the article-relevance screening step of literature review for the Immune Epitope Database. The classifier handles 51.1% of abstracts at 95% sensitivity and specificity, leaving the remaining abstracts for manual review.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Immunoinformatics

Background:

  • The Immune Epitope Database (IEDB) manually curates immune epitope information from scientific literature.
  • Identifying relevant articles for curation is a time-consuming bottleneck in knowledge base development.

Purpose of the Study:

  • To automate the article relevance classification process for the IEDB.
  • To improve the efficiency of manual curation by reducing the number of abstracts requiring expert review.

Main Methods:

  • Developed and trained Naïve Bayes classifiers on 20,910 expert-classified abstracts.
  • Enhanced classifier performance by incorporating PubMed metadata, applying feature selection, and extracting domain-specific patterns (e.g., peptide sequences).
  • Integrated the classifier into the curation workflow to categorize abstracts as relevant, irrelevant, or uncertain.
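
The three steps above can be illustrated with a minimal sketch, not the authors' implementation. It assumes a multinomial Naïve Bayes model with add-one smoothing, a crude hypothetical regular expression for peptide sequences (eight or more consecutive amino-acid letters), and hypothetical probability cut-offs of 0.1 and 0.9 for the three-way decision. The names `tokenize`, `NaiveBayes`, and `triage` are invented for illustration, and the PubMed metadata and feature-selection steps are omitted.

```python
import math
import re
from collections import Counter

# Assumption: a run of 8+ one-letter amino-acid codes is treated as a peptide sequence.
PEPTIDE = re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]{8,}\b")


def tokenize(text):
    """Return lowercase word tokens, collapsing peptide-like strings into one feature."""
    text = PEPTIDE.sub(" peptidetoken ", text)
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.n_docs = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(tokenize(doc))
        self.vocab = set().union(*self.counts.values())
        return self

    def log_scores(self, doc):
        """Unnormalized log posterior for each class."""
        total_docs = sum(self.n_docs.values())
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            score = math.log(self.n_docs[c] / total_docs)
            for tok in tokens:
                score += math.log((self.counts[c][tok] + 1) / (total + len(self.vocab)))
            scores[c] = score
        return scores

    def prob_relevant(self, doc):
        """Posterior probability of the 'relevant' class."""
        scores = self.log_scores(doc)
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        return exp["relevant"] / sum(exp.values())


def triage(p_relevant, low=0.1, high=0.9):
    """Map a probability to relevant / irrelevant / uncertain (cut-offs are hypothetical)."""
    if p_relevant >= high:
        return "relevant"
    if p_relevant <= low:
        return "irrelevant"
    return "uncertain"


# Toy training data, invented for illustration only.
docs = [
    "epitope peptide SIINFEKL binds MHC class I T cell response",
    "T cell epitope mapping of viral protein peptide",
    "crop yield under drought stress in maize",
    "soil nitrogen and maize crop rotation",
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]
model = NaiveBayes().fit(docs, labels)
print(triage(model.prob_relevant("novel T cell epitope peptide")))
```

Widening the gap between `low` and `high` sends more abstracts to manual review and should raise accuracy on the abstracts that are decided automatically; narrowing it does the opposite.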

Main Results:

  • The automated classification achieved 95% sensitivity and specificity on 51.1% of abstracts.
  • The system successfully identified clearly relevant and irrelevant abstracts, flagging uncertain cases for manual review.
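
One plausible reading of these figures is that sensitivity and specificity are measured only on the abstracts the classifier decides automatically, while the share of abstracts decided is reported separately. The sketch below shows that calculation under this assumption; the function name `evaluate` and the toy labels are hypothetical and do not reproduce the study's data.

```python
def evaluate(predictions, truths):
    """Coverage, sensitivity, and specificity when 'uncertain' predictions are set aside.

    predictions: 'relevant', 'irrelevant', or 'uncertain' for each abstract.
    truths: expert label, 'relevant' or 'irrelevant', for each abstract.
    """
    decided = [(p, t) for p, t in zip(predictions, truths) if p != "uncertain"]
    tp = sum(1 for p, t in decided if p == "relevant" and t == "relevant")
    fn = sum(1 for p, t in decided if p == "irrelevant" and t == "relevant")
    tn = sum(1 for p, t in decided if p == "irrelevant" and t == "irrelevant")
    fp = sum(1 for p, t in decided if p == "relevant" and t == "irrelevant")
    return {
        # Fraction of abstracts that received an automatic decision.
        "coverage": len(decided) / len(predictions),
        # Fraction of truly relevant decided abstracts labeled relevant.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Fraction of truly irrelevant decided abstracts labeled irrelevant.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy labels, invented for illustration only.
predictions = ["relevant", "relevant", "irrelevant", "uncertain",
               "irrelevant", "uncertain", "relevant", "irrelevant"]
truths = ["relevant", "relevant", "irrelevant", "relevant",
          "relevant", "irrelevant", "irrelevant", "irrelevant"]
metrics = evaluate(predictions, truths)
print(metrics)
```

Reporting coverage alongside the two accuracy measures matters because a classifier can reach high sensitivity and specificity simply by marking most abstracts as uncertain.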

Conclusions:

  • Text classification accelerates reference selection for the IEDB without compromising accuracy.
  • The study offers practical guidance for text classification tool users and provides a benchmark dataset for tool developers.