



A memory-efficient data structure representing exact-match overlap graphs with application for next-generation DNA assembly

Hieu Dinh¹, Sanguthevar Rajasekaran

¹Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA. hdinh@engr.uconn.edu

Bioinformatics (Oxford, England)

June 4, 2011

Summary

This summary is machine-generated.

This study introduces a novel, compact data structure for exact-match overlap graphs, significantly reducing memory and time requirements for DNA assembly. The new structure enables efficient handling of massive datasets generated by next-generation sequencing technologies.


Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Exact-match overlap graphs are crucial for DNA assembly and shortest superstring problems.
Traditional methods struggle with next-generation sequencing data, where the number of strings n can reach billions and an explicitly stored overlap graph can require Ω(n^2) memory.
Existing DNA assemblers face significant time and space limitations.
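
To give a feel for the Ω(n^2) figure above, the following is a rough back-of-the-envelope sketch. It assumes the cheapest conceivable explicit layout, a dense adjacency matrix with one bit per ordered pair of strings; the function name `explicit_graph_bytes` and the sample values of n are illustrative choices, not taken from the paper.

```python
def explicit_graph_bytes(n: int) -> int:
    """Bytes for a dense n-by-n adjacency matrix at one bit per ordered pair.

    Assumption: one bit per pair, which ignores edge weights entirely,
    so real explicit representations would need more than this.
    """
    return (n * n + 7) // 8  # round up to whole bytes


# Illustrative values of n (number of reads); chosen arbitrarily.
for n in (10**6, 10**8, 10**9):
    terabytes = explicit_graph_bytes(n) / 1e12
    print(f"n = {n:>13,}: {terabytes:,.3f} TB")
```

Even under this optimistic assumption, a billion strings would need on the order of 10^17 bytes, which is why quadratic storage is impractical at this scale.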

Purpose of the Study:

To propose a novel data structure for compactly storing exact-match overlap graphs.
To develop efficient algorithms for constructing this data structure.
To address the memory and time inefficiencies of current DNA assembly approaches.

Main Methods:

Definition of maximal exact-match overlap and exact-match overlap graphs with a given threshold (λ).
Development of a compact data structure that can be constructed in O(λℓn) or O(λℓn log n) time, where n is the number of strings and ℓ is their length.
Implementation of algorithms with linear memory requirements.
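
The definitions listed above can be illustrated with a deliberately naive sketch. It assumes a formulation that this summary does not spell out: all reads have the same length ℓ, the maximal exact-match overlap of x and y is the longest proper suffix of x that is also a prefix of y, and an edge (x, y) of weight ω = ℓ - overlap exists only when ω ≤ λ. The names `max_overlap` and `overlap_graph` are hypothetical, and the quadratic brute force here is for illustration only; it is not the construction algorithm the study proposes.

```python
from typing import Dict, List, Tuple


def max_overlap(x: str, y: str) -> int:
    """Length of the longest proper suffix of x that is also a prefix of y."""
    for k in range(min(len(x), len(y)) - 1, 0, -1):
        if x.endswith(y[:k]):
            return k
    return 0


def overlap_graph(reads: List[str], lam: int) -> Dict[Tuple[int, int], int]:
    """Brute-force overlap graph: edge (i, j) with weight w = ell - overlap if w <= lam.

    Assumes every read has the same length ell and skips self-loops.
    """
    ell = len(reads[0])
    edges: Dict[Tuple[int, int], int] = {}
    for i, x in enumerate(reads):
        for j, y in enumerate(reads):
            if i == j:
                continue
            w = ell - max_overlap(x, y)
            if w <= lam:
                edges[(i, j)] = w
    return edges


# Toy input; reads and threshold are made up for the example.
reads = ["ACGTA", "CGTAC", "GTACG", "TTTTT"]
graph = overlap_graph(reads, lam=2)
print(graph)
```

In this toy input the read TTTTT shares no overlap with the others and stays isolated, while pairs that overlap in only two characters (weight 3) are dropped by the threshold λ = 2.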

Main Results:

The proposed data structure uses at most (2λ - 1)(2⌈log n⌉ + ⌈log λ⌉)n bits.
Edge access time is guaranteed to be O(log λ).
Two construction algorithms achieve O(λℓn) and O(λℓn log n) time complexities with linear memory usage.
Experimental results confirm efficient construction on large simulated datasets.
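
The size bound and the O(log λ) access time above can be made concrete with a small sketch. Two assumptions are made that the summary does not confirm: logarithms are base 2, and each vertex stores at most 2λ - 1 records of the form (lo, hi, weight), where lo and hi delimit a range of read indices in some sorted order. That layout is only one plausible reading that matches the shape of the bound (two indices of ⌈log n⌉ bits plus a weight of ⌈log λ⌉ bits per record); the names `compact_bits` and `edge_weight` and the sample numbers are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def ceil_log2(v: int) -> int:
    """Exact ceil(log2(v)) for v >= 1, using integer arithmetic only."""
    return (v - 1).bit_length()


def compact_bits(n: int, lam: int) -> int:
    """Upper bound (2*lam - 1) * (2*ceil(log n) + ceil(log lam)) * n, in bits."""
    return (2 * lam - 1) * (2 * ceil_log2(n) + ceil_log2(lam)) * n


def edge_weight(records: List[Tuple[int, int, int]], target: int) -> Optional[int]:
    """Binary search over disjoint (lo, hi, weight) records sorted by lo.

    With at most 2*lam - 1 records per vertex, the search takes O(log lam) steps.
    """
    i = bisect_right(records, (target, float("inf"), float("inf"))) - 1
    if i >= 0 and records[i][0] <= target <= records[i][1]:
        return records[i][2]
    return None


# Sample numbers chosen arbitrarily for illustration.
bits = compact_bits(10**9, 32)
print(f"bound for n = 1e9, lambda = 32: {bits / 8 / 1e9:,.1f} GB")

# Hypothetical out-edge records of one vertex.
records = [(0, 2, 1), (5, 5, 2), (7, 9, 1)]
print(edge_weight(records, 5), edge_weight(records, 3))
```

The bound grows linearly in n for a fixed λ, in contrast to the quadratic growth of an explicitly stored graph.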

Conclusions:

The novel data structure offers a significant improvement in memory and time efficiency for constructing exact-match overlap graphs.
This advancement is critical for handling the massive datasets in modern DNA sequencing and assembly.
The developed DNA sequence assembler incorporating the data structure is available for use.