Learning statistical models for annotating proteins with function information using biomedical text.

Soumya Ray¹, Mark Craven

  • ¹ Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin 53706, USA. sray@cs.wisc.edu

BMC Bioinformatics
|June 18, 2005
Summary
This summary is machine-generated.

This study developed a text mining system for automatic protein annotation using Gene Ontology (GO) codes. The system effectively filtered and ranked annotations, showing improved accuracy with external data.

Area of Science:

  • Biomedical Informatics
  • Computational Biology
  • Text Mining

Background:

  • BioCreative text mining evaluation focused on information extraction from biomedical literature.
  • Task 2 involved automatic annotation of proteins with Gene Ontology (GO) codes using article text.

Purpose of the Study:

  • To develop and evaluate a system for automatic protein annotation with GO codes.
  • To leverage biomedical literature text for evidence-based annotation.

Main Methods:

  • Utilized statistical analyses of full-text articles.
  • Developed n-gram models for GO code hypothesis generation.
  • Employed Naïve Bayes models for filtering and ranking annotations.
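
As a rough illustration of the filtering and ranking step, the sketch below trains a two-class multinomial Naïve Bayes model on tokenized passages and orders candidate annotations by log-odds. It is not the authors' implementation: the function names, the toy training passages, the candidate labels, and the Laplace smoothing parameter `alpha` are all assumptions made for the example.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a two-class multinomial Naive Bayes model on tokenized passages.

    docs   : list of token lists
    labels : list of bools (True = passage supports the annotation)
    alpha  : Laplace smoothing constant (an assumption, not from the paper)
    """
    vocab = {tok for doc in docs for tok in doc}
    counts = {True: Counter(), False: Counter()}
    n_docs = Counter(labels)
    for doc, label in zip(docs, labels):
        counts[label].update(doc)

    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in (True, False):
        total = sum(counts[c].values())
        model["log_prior"][c] = math.log(n_docs[c] / len(docs))
        model["log_lik"][c] = {
            tok: math.log((counts[c][tok] + alpha) / (total + alpha * len(vocab)))
            for tok in vocab
        }
    return model


def log_odds(model, doc):
    """Log-odds that a passage supports the annotation; unseen tokens are skipped."""
    score = model["log_prior"][True] - model["log_prior"][False]
    for tok in doc:
        if tok in model["vocab"]:
            score += model["log_lik"][True][tok] - model["log_lik"][False][tok]
    return score


def rank_candidates(model, candidates):
    """Sort (label, tokens) pairs from most to least strongly supported."""
    return sorted(candidates, key=lambda c: log_odds(model, c[1]), reverse=True)


# Hypothetical toy data: two supporting passages and two unrelated ones.
train_docs = [
    ["kinase", "phosphorylates", "substrate"],
    ["protein", "kinase", "activity"],
    ["cells", "were", "cultured"],
    ["samples", "were", "stored"],
]
train_labels = [True, True, False, False]
model = train_naive_bayes(train_docs, train_labels)

# Hypothetical candidate annotations, each paired with its evidence tokens.
candidates = [
    ("candidate_A", ["cells", "were", "stored"]),
    ("candidate_B", ["kinase", "activity"]),
]
ranked = rank_candidates(model, candidates)
```

Filtering could then be a threshold on the same score, for example keeping only candidates whose log-odds exceed zero; the choice of threshold is likewise an assumption here.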

Main Results:

  • System performance was competitive in the BioCreative evaluation.
  • Naïve Bayes models significantly improved annotation accuracy.
  • External data sources enhanced model performance.

Conclusions:

  • The developed system demonstrates strong performance in automatic protein annotation.
  • Statistical text mining approaches, particularly Naïve Bayes, are effective for this task.
  • Integration of external data sources is crucial for accurate biomedical text mining.