Feature-by-feature--evaluating de novo sequence assembly.

Francesco Vezzi¹, Giuseppe Narzisi, Bud Mishra

  • ¹Department of Mathematics and Informatics, University of Udine, Udine, Italy.

PLoS ONE | February 10, 2012
Summary
This summary is machine-generated.

This study shows that common whole-genome sequence assembly metrics, such as N50, are insufficient for accurately comparing assemblers. Multivariate analysis identifies a reduced set of features that supports a more reliable assessment, and the results also highlight the limitations of evaluations based on simulated data.

Area of Science:

  • Computational Biology
  • Bioinformatics
  • Genomics

Background:

  • Whole-genome sequence assembly (WGSA) is a critical problem in computational biology.
  • Existing tools (assemblers) often claim to solve WGSA, but systematic accuracy comparisons are lacking.
  • Traditional evaluation metrics (e.g., N50) and simulated datasets have limitations in reflecting true assembly quality and correctness.
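
To make the N50 limitation concrete, here is a minimal sketch of how N50 is commonly computed: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly length. The two contig-length lists are hypothetical and are not taken from the study; they only illustrate that wrongly joining two contigs raises N50 even though the assembly has become less correct.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs (illustration only).
correct = [400, 300, 200, 100]
# Same contigs, but the 400 bp and 300 bp contigs were wrongly joined.
misjoined = [700, 200, 100]

print(n50(correct))    # 300
print(n50(misjoined))  # 700
```

The mis-joined assembly scores more than twice as high, which is one reason a contiguity metric alone says little about correctness.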

Purpose of the Study:

  • To systematically analyze the relationships and importance of different features used in evaluating genome assembly quality and correctness.
  • To address the limitations of the Feature Response Curve (FRC) method by accounting for feature correlations.
  • To identify a reduced set of highly informative features for more accurate and reliable assembler performance comparison.
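
The summary does not define the Feature Response Curve, so the following is only a sketch under stated assumptions. It assumes the commonly described construction: for a feature threshold, contigs are taken from longest to shortest while their cumulative feature count stays within the threshold, and the genome coverage of those contigs gives one point on the curve. The `Contig` class, the function names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    length: int    # contig length in base pairs
    features: int  # number of suspicious features flagged in this contig


def frc_point(contigs, threshold, genome_size):
    """Coverage reached by the longest contigs whose cumulative
    feature count stays at or below `threshold` (assumed definition)."""
    covered = 0
    used = 0
    for contig in sorted(contigs, key=lambda c: c.length, reverse=True):
        if used + contig.features > threshold:
            break  # assumption: stop at the first contig that exceeds the budget
        used += contig.features
        covered += contig.length
    return covered / genome_size


def frc_curve(contigs, thresholds, genome_size):
    """Return (threshold, coverage) pairs, one per threshold."""
    return [(t, frc_point(contigs, t, genome_size)) for t in thresholds]


# Hypothetical assembly of a 1,000 bp genome.
assembly = [Contig(500, 3), Contig(300, 0), Contig(200, 1)]
print(frc_curve(assembly, [0, 3, 4], genome_size=1000))
# [(0, 0.0), (3, 0.8), (4, 1.0)]
```

Under this reading, a curve that rises quickly indicates an assembler that covers most of the genome while accumulating few flagged features.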

Main Methods:

  • Analysis of feature correlations in whole-genome sequence assembly.
  • Application of multivariate statistical techniques, including Principal Component Analysis (PCA) and Independent Component Analysis (ICA).
  • Use of the Feature Response Curve (FRC) method with a refined set of features.
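
As a rough illustration of the first step, feature correlations can be screened with a pairwise correlation coefficient. The sketch below uses Pearson correlation, which is an assumption: the summary does not say which measure the authors used. The feature names and per-assembly counts are hypothetical. Pairs that correlate strongly carry largely redundant information, which is the kind of redundancy that PCA or ICA would then reduce.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated here
    return cov / (sd_x * sd_y)


def correlated_pairs(table, cutoff=0.9):
    """List feature pairs whose absolute correlation is at least `cutoff`."""
    names = sorted(table)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(table[first], table[second])
            if abs(r) >= cutoff:
                pairs.append((first, second, round(r, 3)))
    return pairs


# Hypothetical feature counts, one value per assembly (illustration only).
feature_counts = {
    "feature_a": [1, 2, 3, 4, 5],
    "feature_b": [2, 4, 6, 8, 10],  # moves in lockstep with feature_a
    "feature_c": [5, 1, 4, 2, 3],
}

print(correlated_pairs(feature_counts))
# [('feature_a', 'feature_b', 1.0)]
```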

Main Results:

  • Multivariate analysis revealed 'excess-dimensionality' in the feature space and demonstrated the inadequacy of the N50 metric for assessing assembly quality.
  • Independent Component Analysis identified a subset of features that better describe assembler performance.
  • The study confirmed that evaluations based on simulated data can yield unrealistic results.

Conclusions:

  • A reduced set of highly informative features, identified through multivariate analysis, enables a more accurate comparison of genome assemblers using the FRC method.
  • The findings underscore the need for improved evaluation strategies beyond traditional metrics and simulated datasets.
  • This work provides a more robust framework for assessing and comparing whole-genome sequence assembly tools.