Gossamer--a resource-efficient de novo assembler.

Thomas Conway¹, Jeremy Wazny, Andrew Bromage

¹NICTA Victoria Research Laboratory, Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria 3010, Australia. tom.conway@nicta.com.au

Bioinformatics (Oxford, England)

May 22, 2012

Summary

This summary is machine-generated.

Gossamer is a new tool for de novo assembly of short-read sequencing data. It efficiently produces high-quality genome assemblies with minimal memory usage.

Area of Science:

Genomics
Bioinformatics

Background:

De novo assembly of short-read high-throughput sequencing data presents significant computational challenges due to large data volumes, short read lengths, and sequencing errors.
Existing short-read assemblers are often limited to smaller genomes or require substantial computing resources, frequently yielding suboptimal results.
Many current assembly algorithms employ greedy approaches, which can compromise the quality of the final genome assembly.

Purpose of the Study:

To develop a novel computational tool for efficient and high-quality de novo assembly of short-read sequencing data.
To address the memory and computational limitations of existing genome assembly software.
To provide a space-efficient and effective solution for assembling large and complex genomes.

Main Methods:

Implementation of the de Bruijn graph approach for sequence assembly.
Development of Gossamer, a software package designed for efficient processing of large sequencing datasets.
Optimization of memory usage to approach theoretical minimums for de novo assembly.
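
The de Bruijn graph approach named above can be illustrated with a minimal sketch. This is not Gossamer's implementation: the function names `build_de_bruijn` and `walk_unbranched` are hypothetical, the graph is held in an ordinary Python dictionary rather than a memory-optimized structure, and the sketch assumes error-free reads and ignores reverse complements.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph from reads (illustrative only).

    Nodes are (k-1)-mers. Each distinct k-mer found in the reads
    contributes one edge from its (k-1)-prefix to its (k-1)-suffix.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])

    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the output deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def walk_unbranched(graph, start):
    """Spell the sequence along a path from `start`.

    Simplification (an assumption of this sketch): the walk continues
    while the current node has exactly one successor and checks only
    out-degree; a real assembler would also check in-degree.
    """
    sequence = start
    node = start
    seen = {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in seen:  # stop rather than loop forever on a cycle
            break
        seen.add(node)
        sequence += node[-1]
    return sequence


if __name__ == "__main__":
    # Two overlapping toy reads; made-up data for illustration.
    toy_reads = ["ATGGCGT", "GCGTGCA"]
    g = build_de_bruijn(toy_reads, k=4)
    print(walk_unbranched(g, "ATG"))
```

A dictionary of strings like this is convenient for small examples but costs far more memory per k-mer than a compact encoding would, which is the kind of overhead a space-efficient assembler aims to avoid.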

Main Results:

Gossamer demonstrates significant space efficiency, requiring minimal memory for operation.
The software enables efficient processing of high-throughput sequencing data.
Gossamer produces high-quality genome assemblies, outperforming existing methods in terms of accuracy and completeness.
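
To give a sense of scale for the space-efficiency claims, the sketch below computes a general information-theoretic lower bound: any lossless encoding of an arbitrary set of n distinct k-mers, drawn from the 4^k possible k-mers, needs at least log2 C(4^k, n) bits. The summary does not say which theoretical minimum is meant, so treating this bound as the relevant one is an assumption, and the function name `min_bits_for_kmer_set` is hypothetical.

```python
import math


def min_bits_for_kmer_set(k, n):
    """Lower bound in bits for storing n distinct k-mers.

    Computes log2 of the binomial coefficient C(4**k, n) via log-gamma,
    so the result is a floating-point approximation, not an exact count.
    Assumption: the k-mer set is treated as an arbitrary subset of all
    4**k possible k-mers.
    """
    universe = 4 ** k
    if not 0 <= n <= universe:
        raise ValueError("n must be between 0 and 4**k")
    log_binom = (
        math.lgamma(universe + 1)
        - math.lgamma(n + 1)
        - math.lgamma(universe - n + 1)
    )
    return log_binom / math.log(2)


if __name__ == "__main__":
    # Made-up example sizes, chosen only to show the calculation.
    k, n = 25, 10**9
    total_bits = min_bits_for_kmer_set(k, n)
    print(f"about {total_bits / n:.1f} bits per k-mer at minimum")
```

Dividing the bound by n gives a per-k-mer figure against which the memory use of any concrete graph representation can be compared.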

Conclusions:

Gossamer offers a computationally efficient and memory-sparing solution for de novo genome assembly.
The tool is capable of generating high-quality assemblies from short-read sequencing data.
Gossamer provides a valuable resource for researchers dealing with large-scale genomic data analysis.