A machine learning-based system to normalise gene mentions to unique database identifiers.

Yifei Chen (1), Feng Liu, Bernard Manderick

  • (1) School of Information Sciences, Nanjing Audit University, Nanjing 211815, China. yifeichen91@nau.edu.cn

International Journal of Data Mining and Bioinformatics | February 3, 2012
Summary
This summary is machine-generated.

We developed a Gene Normalizer (GNer) to assign unique IDs to gene mentions in scientific literature. This tool uses machine learning and rules to improve gene identification accuracy.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Natural Language Processing in Biology

Background:

  • Accurate identification of gene mentions in biological literature is crucial for knowledge extraction.
  • Existing methods face challenges with gene name variations and ambiguities.

Purpose of the Study:

  • To develop an integrated Gene Normalizer (GNer) for assigning unique database identifiers to gene mentions.
  • To improve the accuracy and reduce ambiguity in gene name recognition within biological texts.

Main Methods:

  • Constructing a dictionary from EntrezGene and BioThesaurus.
  • Implementing a pre-processor to reduce synonym variations and ambiguities.
  • Utilizing Support Vector Machines (SVMs) and rule-based components for disambiguation.
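The three steps above can be illustrated with a minimal sketch. It is not the GNer implementation: the dictionary entries, identifiers, and function names are hypothetical, and a simple context-keyword overlap rule stands in for the SVM and rule-based disambiguation described in the paper.

```python
import re
from collections import defaultdict

# Toy dictionary entries (hypothetical identifiers, synonyms, and context keywords).
# A real system would load these from resources such as EntrezGene and BioThesaurus.
TOY_ENTRIES = [
    ("GENE:0001", ["IL-2", "interleukin 2"], {"cytokine", "t-cell"}),
    ("GENE:0002", ["IL2", "example protein"], {"kinase", "membrane"}),
    ("GENE:0003", ["p53", "tumor protein 53"], {"tumor", "apoptosis"}),
]


def preprocess(mention):
    """Reduce synonym variation: lowercase and drop spaces, hyphens, punctuation."""
    return re.sub(r"[^a-z0-9]", "", mention.lower())


def build_dictionary(entries):
    """Map each pre-processed synonym to the set of identifiers it may denote."""
    lookup = defaultdict(set)
    keywords = {}
    for gene_id, synonyms, context in entries:
        keywords[gene_id] = context
        for synonym in synonyms:
            lookup[preprocess(synonym)].add(gene_id)
    return lookup, keywords


def normalise(mention, sentence, lookup, keywords):
    """Return one identifier for a mention, or None if it is not in the dictionary."""
    candidates = lookup.get(preprocess(mention), set())
    if not candidates:
        return None
    if len(candidates) == 1:
        return next(iter(candidates))
    # Ambiguous mention: pick the candidate whose context keywords overlap most
    # with the sentence. This rule is a stand-in for a trained classifier.
    tokens = set(re.findall(r"[a-z0-9-]+", sentence.lower()))
    return max(sorted(candidates), key=lambda g: len(keywords[g] & tokens))


lookup, keywords = build_dictionary(TOY_ENTRIES)
sentence = "IL-2 is a cytokine secreted by T-cell populations"
print(normalise("IL-2", sentence, lookup, keywords))
print(normalise("P53", "P53 was mutated", lookup, keywords))
```

In this toy setup the pre-processor maps both "IL-2" and "IL2" to the same key, which makes the mention ambiguous; the sentence context then selects one candidate. In a system like the one summarized here, a trained classifier would make that choice from richer features.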

Main Results:

  • The Gene Normalizer (GNer) achieved a precision of 80.5%.
  • The system demonstrated a recall of 86.4%.
  • The F-measure (β = 1), the harmonic mean of precision and recall, was 83.4%, indicating balanced overall performance.
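As a quick check on how these three figures relate, the F-measure can be recomputed from the reported precision and recall. The sketch below uses the standard F-beta definition; the function name is illustrative and not taken from the paper.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Reported values from the summary, expressed as fractions.
score = f_beta(0.805, 0.864)
print(f"F1 = {score:.3f}")
```

With the rounded inputs, the result comes out at roughly 0.833, in line with the reported 83.4%; the small difference is presumably due to rounding of the published precision and recall.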

Conclusions:

  • The proposed Gene Normalizer (GNer) effectively assigns unique identifiers to gene mentions.
  • The integrated approach combining SVMs and rule-based methods enhances gene recognition accuracy.
  • GNer offers a valuable tool for biological literature analysis and knowledge discovery.