The TREC 2004 genomics track categorization task: classifying full text biomedical documents.

Aaron M Cohen¹, William R Hersh

  • ¹Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Road, Mail Code: BICC, Portland, Oregon, 97239-3098, USA. cohenaa@ohsu.edu

Journal of Biomedical Discovery and Collaboration | May 26, 2006
Summary
This summary is machine-generated.

Automated document classification for Gene Ontology (GO) annotation is challenging but crucial for biomedical curation. Future work will refine algorithms for better accuracy in this text mining task.

Area of Science:

  • Biomedical Informatics
  • Computational Biology
  • Text Mining

Background:

  • The TREC 2004 Genomics Track evaluated information retrieval and text mining for genomic data. The focus was on document categorization to aid biomedical curation.
  • The categorization task simulated curators at Mouse Genome Informatics (MGI), involving subtasks for triaging articles and assigning Gene Ontology (GO) categories.

Purpose of the Study:

  • To assess automated document classification for GO annotation.
  • To evaluate systems for triaging articles with experimental evidence for GO terms.
  • To test the assignment of top-level GO categories to relevant documents.

Main Methods:

  • Utilized a document categorization task within the TREC 2004 Genomics Track.
  • Evaluated systems on subtasks including article triage and GO category assignment.
  • Analyzed performance using utility measures and F-measures.

Main Results:

  • The triage subtask showed a mean utility of 0.3303; the top systems did not significantly outperform simply selecting articles indexed with the MeSH term "Mice".
  • Sample coverage of GO terms in the dataset was sparse, suggesting a need for task-specific approaches.
  • The annotation subtask achieved a mean F-measure of 0.3824, with gene name recognition showing benefit.

Conclusions:

  • Automated GO annotation and evidence code extraction are challenging but offer significant benefits for biomedical curation.
  • Continued research is needed to identify optimal algorithmic features and understand task characteristics for feasible and useful automated classification.
  • The TREC Genomics Track will continue in 2005 with expanded triage tasks and aims to improve 2004 results.
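
The utility and F-measure scores reported in the results can be illustrated with a minimal Python sketch. This summary does not give the formulas, so the sketch assumes a linear utility normalized by its maximum achievable value, and the balanced F-measure (harmonic mean of precision and recall). The weights `u_r` and `u_nr`, the function names, and all counts are assumptions chosen for illustration, not values taken from the study.

```python
def normalized_utility(tp, fp, all_pos, u_r=20.0, u_nr=-1.0):
    """Normalized linear utility for a triage run.

    tp      -- relevant documents the system selected
    fp      -- non-relevant documents the system selected
    all_pos -- total relevant documents in the collection
    u_r     -- reward per true positive (assumed weight)
    u_nr    -- penalty per false positive (assumed weight)
    """
    u_raw = u_r * tp + u_nr * fp   # raw utility of the run
    u_max = u_r * all_pos          # utility of a perfect run
    return u_raw / u_max


def f_measure(tp, fp, fn):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0                 # no correct assignments at all
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(normalized_utility(tp=50, fp=200, all_pos=100))
    print(f_measure(tp=30, fp=30, fn=70))
```

With a reward weight much larger than the penalty weight, a run can select many non-relevant articles and still score well, which is one way a simple, high-recall baseline can be hard to beat on a utility measure.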