



Scaling up genome annotation using MAKER and work queue.

Andrew Thrasher¹, Zachary Musgrave², Brian Kachmarck¹

  • ¹Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, USA.

International Journal of Bioinformatics Research and Applications | July 4, 2014
Summary
This summary is machine-generated.

A new bioinformatics framework accelerates genome annotation by 45x on distributed computing resources. This parallel framework enhances scalability for next-generation sequencing analyses on clusters and clouds.

Keywords:
Caenorhabditis japonica, bioinformatics, cloud computing, clusters, distributed computing, explicit data transfer, genome annotation, grid computing, next generation sequencing, work queue


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Next-generation sequencing (NGS) generates vast amounts of data, increasing demand for efficient bioinformatics analyses.
  • Many bioinformatics applications, including genome annotation, require significant computational resources and can benefit from parallel processing on clusters, clouds, or grids.
  • Existing tools often rely on shared file systems, posing limitations for distributed computing environments.

Purpose of the Study:

  • To develop and evaluate a modified annotation framework for parallel execution of bioinformatics tools on distributed computing resources.
  • To enhance the scalability and efficiency of genome annotation pipelines, specifically addressing limitations of shared file system dependencies.
  • To enable seamless execution of sequence analysis tools across diverse computing infrastructures like clusters, clouds, and grids.

Main Methods:

  • Starting point: the underlying genome annotation tool (MAKER), which is parallelized as a Message Passing Interface (MPI) application.
  • Modification of the framework so that MAKER can run without MPI, facilitating broader compatibility with distributed resources.
  • Implementation of explicit data transfer mechanisms to overcome shared file system limitations.
  • Evaluation of the framework's performance using a Caenorhabditis japonica test case on a cluster and within the Amazon EC2 cloud environment.
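
The explicit data transfer idea in the methods above can be illustrated with a minimal Python sketch. It is not the authors' implementation and uses neither MAKER nor Work Queue: the `annotate_stub` function, the per-task scratch directories, and the thread pool standing in for remote workers are all assumptions made for illustration. What it aims to show is that each task names its input and output files up front, so a coordinator can copy them to and from a worker that shares no file system with it.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_fasta(text):
    """Split FASTA text into one record (header plus sequence lines) per task."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def annotate_stub(in_path, out_path):
    """Hypothetical stand-in for an annotation tool: reports sequence length."""
    lines = in_path.read_text().splitlines()
    name = lines[0][1:].split()[0]
    length = sum(len(seq.strip()) for seq in lines[1:])
    out_path.write_text(f"{name}\t{length}\n")


def run_task(task_id, record, results_dir):
    """Run one task in a private scratch directory, as a remote worker might."""
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        in_path = scratch / "input.fasta"
        out_path = scratch / "output.tsv"
        in_path.write_text(record)        # explicit transfer: input to the worker
        annotate_stub(in_path, out_path)
        dest = results_dir / f"task_{task_id}.tsv"
        shutil.copy(out_path, dest)       # explicit transfer: output back
    return dest


def annotate_all(fasta_text, results_dir, workers=4):
    """Dispatch one task per FASTA record and collect results in input order."""
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    records = split_fasta(fasta_text)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_task, i, rec, results_dir)
                   for i, rec in enumerate(records)]
        paths = [f.result() for f in futures]
    return [p.read_text().strip() for p in paths]
```

Because every task declares its files explicitly, the same task description could in principle be handed to a cluster node or a cloud instance; only the copy steps would change.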

Main Results:

  • Achieved a 45x speed-up in genome annotation using 50 workers on the Caenorhabditis japonica test case.
  • Demonstrated the framework's ability to run efficiently on distributed computing resources, including cloud environments (Amazon EC2).
  • Successfully enabled parallel execution of the annotation tool without MPI, enhancing its applicability.
  • Facilitated explicit data transfer, mitigating issues associated with shared file system dependencies.
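
As a quick check on what the reported figure implies, the sketch below computes the parallel efficiency of a 45x speed-up on 50 workers. It also computes the serial fraction that would produce that speed-up, assuming Amdahl's law is a fair model here, which the source does not state. The function names are illustrative.

```python
def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / workers


def amdahl_serial_fraction(speedup, workers):
    """Serial fraction f solving speedup = 1 / (f + (1 - f) / workers)."""
    if workers <= 1:
        raise ValueError("need more than one worker")
    return (workers / speedup - 1) / (workers - 1)


# Figures reported in the results: 45x speed-up on 50 workers.
efficiency = parallel_efficiency(45, 50)
serial = amdahl_serial_fraction(45, 50)
print(f"efficiency: {efficiency:.0%}")
print(f"implied serial fraction: {serial:.4f}")
```

The result is an efficiency of 90%, with an implied serial fraction of roughly 0.2% under that model.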

Conclusions:

  • The modified annotation framework significantly enhances the speed and scalability of bioinformatics analyses, particularly genome annotation.
  • The framework effectively utilizes distributed computing resources (clusters, clouds, grids) by removing MPI dependency and enabling explicit data transfer.
  • This approach provides a flexible and efficient way to run sequence analysis tools, including tools still in early stages of development, on modern computational infrastructures.