Parallel short sequence assembly of transcriptomes.

Benjamin G Jackson¹, Patrick S Schnable, Srinivas Aluru

¹Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011, USA. zbbrox@iastate.edu

BMC Bioinformatics

February 12, 2009

Summary

This summary is machine-generated.

We developed a novel parallel method for assembling transcriptomes from large sets of short sequences. The approach addresses the computational and memory demands of short sequence assembly and scales to datasets of hundreds of millions of sequences.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

De novo assembly of genomes and transcriptomes from short sequences is computationally demanding.
Because of this complexity, existing methods are often limited to smaller inputs, such as BACs or prokaryotic genomes.

Purpose of the Study:

To present a parallel method for efficient transcriptome assembly from large short-sequence datasets.
To address the computational and space complexity of short sequence assembly.

Main Methods:

Used a graph-theoretic framework and parallel computing.
Constructed a distributed bidirected graph to capture sequence overlaps.
Compacted chains of the graph into contigs using parallel list ranking, then processed the graph to resolve repeats.
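
The list-ranking step can be illustrated with a small sketch. This is not the authors' implementation: it runs on one machine, ignores the orientation information a bidirected graph carries, and assumes the chain has already been identified. The function names `list_rank` and `compact_chain`, the `succ` array layout, and the toy 3-mers are all hypothetical. The sketch uses pointer jumping, a standard list-ranking technique in which every node repeatedly replaces its successor pointer with its successor's successor, so that a chain of n nodes is ranked in about log2(n) rounds. Each round's inner loop is written sequentially here but is the part that would run in parallel.

```python
def list_rank(succ):
    """Rank the nodes of a linked list by pointer jumping.

    succ[i] is the index of the node that follows node i, or None for
    the tail. Returns rank[i], the number of links from i to the tail.
    """
    n = len(succ)
    nxt = list(succ)
    rank = [0 if s is None else 1 for s in succ]
    # Every round doubles the distance each pointer spans, so the loop
    # runs about log2(n) times.
    while any(p is not None for p in nxt):
        new_rank = rank[:]
        new_nxt = nxt[:]
        for i in range(n):  # conceptually one parallel step over all nodes
            j = nxt[i]
            if j is not None:
                new_rank[i] = rank[i] + rank[j]
                new_nxt[i] = nxt[j]
        rank, nxt = new_rank, new_nxt
    return rank


def compact_chain(kmers, succ):
    """Merge an unbranched chain of k-mers into one contig.

    Assumes consecutive k-mers overlap by k-1 characters, so each node
    after the first contributes only its last character.
    """
    rank = list_rank(succ)
    # The head of the chain has the largest rank, the tail has rank 0.
    order = sorted(range(len(kmers)), key=lambda i: -rank[i])
    contig = kmers[order[0]]
    for i in order[1:]:
        contig += kmers[i][-1]
    return contig


# Chain ACG -> CGT -> GTA, stored in scrambled order.
kmers = ["GTA", "ACG", "CGT"]
succ = [None, 2, 0]
print(compact_chain(kmers, succ))
```

Ranking first and then sorting by rank is what lets nodes scattered across memory (or, in a distributed setting, across processors) be placed in chain order without walking the chain one link at a time.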

Main Results:

Successfully assembled 925 million sequences (40 billion nucleotides) from a synthetic dataset.
Achieved assembly in minutes on a 1024-processor Blue Gene/L system.
Demonstrated scalability to large problem sizes.

Conclusions:

The presented method is the first fully distributed approach for assembling non-hierarchical short sequence data.
The method is validated using a synthetic dataset from Zea mays coding regions.
The approach offers a scalable solution for large-scale transcriptome assembly.