A machine learning-based system to normalise gene mentions to unique database identifiers.

Yifei Chen¹, Feng Liu, Bernard Manderick

¹School of Information Sciences, Nanjing Audit University, Nanjing 211815, China. yifeichen91@nau.edu.cn

International Journal of Data Mining and Bioinformatics

February 3, 2012

Summary

This summary is machine-generated.

We developed a Gene Normalizer (GNer) to assign unique IDs to gene mentions in scientific literature. This tool uses machine learning and rules to improve gene identification accuracy.

Area of Science:

Bioinformatics
Computational Biology
Natural Language Processing in Biology

Background:

Accurate identification of gene mentions in biological literature is crucial for knowledge extraction.
Existing methods face challenges with gene name variations and ambiguities.

Purpose of the Study:

To develop an integrated Gene Normalizer (GNer) for assigning unique database identifiers to gene mentions.
To improve the accuracy and reduce ambiguity in gene name recognition within biological texts.

Main Methods:

Constructing a dictionary from EntrezGene and BioThesaurus.
Implementing a pre-processor to reduce synonym variations and ambiguities.
Utilizing Support Vector Machines (SVMs) and rule-based components for disambiguation.
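
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the GNer implementation: the toy records, the placeholder identifiers (`GENE:0001` and so on), the spelling rules in `normalise`, and the keyword-overlap rule in `disambiguate` are all invented for the example. In particular, the SVM component described above is not reproduced here; a simple rule stands in for it.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins for EntrezGene/BioThesaurus records:
# (surface form, placeholder identifier, context keywords).
TOY_RECORDS = [
    ("IL-2", "GENE:0001", {"interleukin", "cytokine"}),
    ("interleukin 2", "GENE:0001", {"interleukin", "cytokine"}),
    ("CAT", "GENE:0002", {"catalase", "peroxide"}),
    ("CAT", "GENE:0003", {"acetyltransferase", "chloramphenicol"}),
]


def normalise(mention):
    """Pre-processing step: fold case and drop hyphens, slashes and spaces."""
    return re.sub(r"[\s\-_/]+", "", mention.lower())


def build_dictionary(records):
    """Map each normalised synonym to its candidate identifiers and keywords."""
    index = defaultdict(dict)
    for surface, gene_id, keywords in records:
        index[normalise(surface)][gene_id] = keywords
    return index


def disambiguate(candidates, context_words):
    """Rule-based stand-in for the disambiguation step (assumption):
    choose the candidate whose keywords overlap the context most,
    and return None when no candidate is clearly ahead."""
    if len(candidates) == 1:
        return next(iter(candidates))
    scored = sorted(
        ((len(keywords & context_words), gene_id)
         for gene_id, keywords in candidates.items()),
        reverse=True,
    )
    best_score, best_id = scored[0]
    if best_score == 0 or best_score == scored[1][0]:
        return None
    return best_id


def normalise_mention(mention, sentence, index):
    """Return an identifier for a gene mention, or None if unresolved."""
    candidates = index.get(normalise(mention))
    if not candidates:
        return None
    context = set(re.findall(r"[a-z]+", sentence.lower()))
    return disambiguate(candidates, context)


index = build_dictionary(TOY_RECORDS)
resolved = normalise_mention("CAT", "CAT degrades hydrogen peroxide", index)
```

In this toy setup, "IL 2" and "IL-2" collapse to the same dictionary key, while the ambiguous symbol "CAT" is resolved only when the surrounding sentence contains a keyword tied to one candidate. A trained classifier would replace the overlap rule in a system closer to the one described.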

Main Results:

The Gene Normalizer (GNer) achieved a precision of 80.5%.
The system demonstrated a recall of 86.4%.
The F-measure (beta = 1), which combines precision and recall, was 83.4%.
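
As a quick check of how these three figures relate, the F-measure with beta = 1 is the harmonic mean of precision and recall. The sketch below uses the standard F-beta definition; the function name `f_beta` is made up for illustration. With the rounded precision and recall reported above, it yields about 0.833, slightly below the reported 83.4%. The gap is presumably a rounding effect, since the published value would have been computed from unrounded figures (an assumption, not something the source states).

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; beta = 1 gives the harmonic mean."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values taken from the reported results.
score = f_beta(0.805, 0.864)
print(f"F(beta=1) from rounded inputs: {score:.3f}")
```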

Conclusions:

The proposed Gene Normalizer (GNer) effectively assigns unique identifiers to gene mentions.
The integrated approach combining SVMs and rule-based methods enhances gene recognition accuracy.
GNer offers a valuable tool for biological literature analysis and knowledge discovery.