A graph-search framework for associating gene identifiers with documents.

William W Cohen¹, Einat Minkov

  • ¹Department of Machine Learning, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA. wcohen@cs.cmu.edu

BMC Bioinformatics | October 13, 2006
Summary
This summary is machine-generated.

Combining multiple gene named entity recognition (NER) systems improves gene identifier ranking accuracy. A graph-based approach combined with learning enhances performance, outperforming individual NER systems for model organism database curation.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Model organism database curation requires identifying gene identifiers from scientific articles.
  • A semi-automated approach involves ranking potential gene identifiers for each article.

Purpose of the Study:

  • To compare methods for ranking gene identifiers from text.
  • To evaluate baseline approaches, a graph-based method, and a learning-based reranking method.

Main Methods:

  • Comparing named entity recognition (NER) systems with a soft dictionary.
  • Implementing a graph-based method combining multiple NER outputs and other information sources.
  • Applying a learning method to rerank the graph-based method's output.
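
The graph-based step above can be pictured with a small sketch. The summary does not say which graph algorithm the authors used, so this example assumes a random walk with restart from the document node; the node names, the two imagined NER systems, and the identifiers `gene:G1` and `gene:G2` are all hypothetical.

```python
from collections import defaultdict


def random_walk_with_restart(edges, start, restart=0.5, steps=20):
    """Approximate how much walk probability reaches each node from `start`."""
    neighbors = defaultdict(set)
    for a, b in edges:  # treat every edge as undirected
        neighbors[a].add(b)
        neighbors[b].add(a)

    probs = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[start] += restart  # a fixed share of the mass jumps back to the document
        for node, p in probs.items():
            share = (1.0 - restart) * p / len(neighbors[node])
            for nb in neighbors[node]:
                nxt[nb] += share
        probs = dict(nxt)
    return probs


def rank_gene_ids(probs):
    """Keep only gene-identifier nodes, sorted by descending walk probability."""
    genes = [(node, p) for node, p in probs.items() if node.startswith("gene:")]
    return sorted(genes, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy graph: a document, mentions proposed by two imagined NER
# systems, and dictionary links from mentions to gene identifiers.
edges = [
    ("doc", "mention:wingless"),   # proposed by imagined system A
    ("doc", "mention:wg"),         # proposed by imagined system B
    ("doc", "mention:white"),      # proposed by imagined system B only
    ("mention:wingless", "gene:G1"),
    ("mention:wg", "gene:G1"),
    ("mention:white", "gene:G2"),
]

probs = random_walk_with_restart(edges, start="doc")
ranking = rank_gene_ids(probs)
for node, p in ranking:
    print(f"{node}\t{p:.4f}")
```

In this toy graph `gene:G1` is reachable through two mentions and `gene:G2` through one, so the walk ranks `gene:G1` first. That is the intuition behind combining several NER outputs in a single graph.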

Main Results:

  • NER systems with similar F-measure can differ significantly in gene identifier ranking.
  • The graph-based approach surpasses individual NER systems, with further gains from learning.
  • Entity-level F1 performance does not reliably predict NER utility for gene identifier finding.
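
One way to see why an F-measure need not track ranking quality is to compare a set-based score with an order-sensitive one. The sketch below is a toy illustration only: it assumes average precision as the ranking metric (the summary does not name one) and computes F1 over sets of identifiers rather than over entity mentions, so it does not reproduce the study's evaluation. All identifiers are made up.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k taken at each rank k where a relevant id appears."""
    relevant = set(relevant_ids)
    hits = 0
    total = 0.0
    for rank, gene_id in enumerate(ranked_ids, start=1):
        if gene_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def f1(predicted_ids, relevant_ids):
    """Set-based F1: ignores the order in which ids were proposed."""
    predicted, relevant = set(predicted_ids), set(relevant_ids)
    true_pos = len(predicted & relevant)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(relevant)
    return 2 * precision * recall / (precision + recall)


gold = {"G1", "G2"}                      # hypothetical curated identifiers
system_a = ["G1", "G2", "G3", "G4"]      # correct ids ranked first
system_b = ["G3", "G4", "G1", "G2"]      # same ids, correct ones ranked last

f1_a, f1_b = f1(system_a, gold), f1(system_b, gold)
ap_a, ap_b = average_precision(system_a, gold), average_precision(system_b, gold)
print(f"F1: {f1_a:.3f} vs {f1_b:.3f}")
print(f"AP: {ap_a:.3f} vs {ap_b:.3f}")
```

Both hypothetical systems propose the same four identifiers and so share an F1 score, but only the first puts the correct ones at the top of the list a curator would read.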

Conclusions:

  • Gene identifier ranking systems benefit from combining multiple NER systems.
  • Learning-based reranking can improve graph-based gene identifier ranking performance.
  • Accurate gene identifier ranking systems can be built using available resources without specialized components.
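
The reranking idea can be sketched as follows. The summary does not say which learning method was used, so a pairwise perceptron stands in here as an assumption; the two features (a graph score and the number of NER systems supporting a candidate) and all identifiers are hypothetical.

```python
def score(weights, features):
    """Linear score of one candidate."""
    return sum(w * f for w, f in zip(weights, features))


def train_reranker(documents, epochs=10):
    """Pairwise perceptron: push correct candidates above incorrect ones.

    `documents` is a list of candidate lists; each candidate is a
    (features, is_correct) pair.
    """
    n_features = len(documents[0][0][0])
    weights = [0.0] * n_features
    for _ in range(epochs):
        for candidates in documents:
            correct = [f for f, ok in candidates if ok]
            wrong = [f for f, ok in candidates if not ok]
            for pos in correct:
                for neg in wrong:
                    if score(weights, pos) <= score(weights, neg):
                        # Move the weights toward the correct candidate.
                        weights = [w + p - q for w, p, q in zip(weights, pos, neg)]
    return weights


def rerank(weights, candidates):
    """Sort (gene_id, features) pairs by descending learned score."""
    return sorted(candidates, key=lambda c: score(weights, c[1]), reverse=True)


# Hypothetical features: [graph score, number of NER systems in agreement].
training_docs = [
    [([0.6, 1], False), ([0.4, 2], True)],
    [([0.5, 1], False), ([0.3, 2], True)],
]
weights = train_reranker(training_docs)

# A new document whose initial graph ranking puts G7 above G8.
new_candidates = [("G7", [0.7, 1]), ("G8", [0.5, 3])]
reranked = rerank(weights, new_candidates)
print([gene_id for gene_id, _ in reranked])
```

In the toy training data the graph score alone orders the candidates wrongly, and the learned weights correct this by rewarding agreement between systems. This illustrates how a reranker might add gains on top of an initial ranking, not how the study's reranker was built.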