Evaluation of gene structure prediction programs

M Burset¹, R Guigó

  • ¹ Departament d'Informàtica Mèdica, Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, E-08003, Spain.

Genomics | June 15, 1996
Summary
This summary is machine-generated.

Computer programs for gene identification show limited accuracy, especially on novel sequences. Protein database searches improve predictions, but current tools struggle with complex genomic DNA structures.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Computational gene identification is crucial for genome projects, particularly with large-scale sequencing.
  • Accurate prediction of protein-coding genes in genomic DNA is essential for understanding gene function and regulation.

Purpose of the Study:

  • To evaluate the current performance of computer programs for predicting protein-coding genes in genomic DNA.
  • To identify the most effective computational approaches for gene identification and assess their limitations.

Main Methods:

  • Uniform testing of gene prediction programs on a large set of vertebrate DNA sequences with simple gene structures.
  • Calculation of predictive accuracy at nucleotide, exon, and protein product levels.
  • Inclusion of protein sequence database searches as a factor in program performance.
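
As an illustration of the nucleotide-level calculation, the sketch below computes sensitivity, specificity, and a correlation coefficient from per-base coding labels. The summary does not state the formulas, so the definitions used here (sensitivity = TP/(TP+FN), specificity = TP/(TP+FP), and the usual correlation coefficient over the 2x2 confusion table) are assumptions based on conventions common in gene-finder evaluation. The function name and the 0/1 label encoding are hypothetical.

```python
import math


def nucleotide_accuracy(actual, predicted):
    """Compare per-base labels (1 = coding, 0 = non-coding).

    Assumed definitions, not taken from the summary:
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)
      cc = (TP*TN - FN*FP) / sqrt((TP+FN)(TN+FP)(TP+FP)(TN+FN))
    """
    if len(actual) != len(predicted):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1  # coding base predicted as coding
        elif p:
            fp += 1  # non-coding base predicted as coding
        elif a:
            fn += 1  # coding base missed
        else:
            tn += 1  # non-coding base predicted as non-coding

    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tp / (tp + fp) if (tp + fp) else nan
    denom = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    cc = (tp * tn - fn * fp) / denom if denom else nan
    return {"sensitivity": sensitivity, "specificity": specificity, "cc": cc}


# Toy 10-base sequence: 4 true coding bases, 4 predicted coding bases.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = nucleotide_accuracy(actual, predicted)
```

The correlation coefficient is undefined when any marginal count is zero (for example, a program that predicts no coding bases at all), which is why the sketch returns NaN in that case rather than raising.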

Main Results:

  • Predictive accuracy was lower than previously reported, particularly for novel sequences lacking similarity to existing data.
  • Most programs achieved a correlation coefficient between 0.60 and 0.70, with fewer than 50% of exons identified accurately.
  • Programs incorporating protein sequence database searches demonstrated substantially higher accuracy.
  • High rates of sequence errors significantly impacted program performance.
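
The exon-level figure above can be illustrated with a small sketch. It assumes that an exon counts as "identified accurately" only when both predicted boundaries match the annotated exon exactly. The summary does not define the criterion, so this strict rule, the function name, and the (start, end) coordinate tuples are all hypothetical.

```python
def exon_accuracy(actual_exons, predicted_exons):
    """Exon-level sensitivity and specificity under an exact-match rule.

    Assumption: an exon is correct only if its (start, end) pair appears
    in both the annotation and the prediction.
    """
    actual = set(actual_exons)
    predicted = set(predicted_exons)
    correct = len(actual & predicted)  # exons with both boundaries right

    nan = float("nan")
    sensitivity = correct / len(actual) if actual else nan
    specificity = correct / len(predicted) if predicted else nan
    return sensitivity, specificity


# Toy gene: three annotated exons, four predicted exons.
annotated = [(100, 200), (300, 400), (500, 650)]
predicted = [(100, 200), (310, 400), (500, 650), (700, 750)]
exon_sn, exon_sp = exon_accuracy(annotated, predicted)
```

In this toy case the second predicted exon misses its start by ten bases and so scores as wrong under the exact-match rule, even though it overlaps the annotated exon almost entirely. This is one reason exon-level accuracy can sit well below nucleotide-level accuracy.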

Conclusions:

  • Current gene prediction programs are overly reliant on training data and struggle with complex genomic structures.
  • While useful for identifying potential exon regions, existing tools are insufficient for complete genomic structure elucidation.
  • Further development is needed to improve the accuracy and robustness of computational gene identification methods for large, uncharacterized genomic sequences.