Gap5--editing the billion fragment sequence assembly.

James K Bonfield¹, Andrew Whitwham

  • ¹Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK. jkb@sanger.ac.uk

Bioinformatics (Oxford, England), June 2, 2010
Summary
This summary is machine-generated.

Gap5 is a sequence assembly editor built to cope with the data volumes produced by current DNA sequencing instruments, which overwhelm earlier editors. The authors demonstrate its scalability on an assembly of 1.1 billion sequence fragments.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Modern DNA sequencing generates vast amounts of data, overwhelming existing assembly editors.
  • Efficiently managing and assembling large-scale genomic datasets is a significant challenge.

Purpose of the Study:

  • To introduce Gap5, scalable software for viewing and editing DNA sequence assemblies.
  • To present the data structures and algorithms enabling Gap5's scalability.
  • To evaluate Gap5's performance and resource utilization.

Main Methods:

  • Gap5 is implemented in C and Tcl/Tk.
  • It utilizes specific data structures and algorithms for scalability.
  • Performance is benchmarked against other assembly programs using large datasets.
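
The summary does not say which data structures Gap5 uses, so the following is only a generic illustration of why indexing matters at this scale, not Gap5's actual design. It sketches a fixed-width bin index in Python: an editor displaying one region of a contig looks up only the bins that region touches instead of scanning every read. The class name `BinnedReadIndex`, its methods, and the bin size are all hypothetical.

```python
from collections import defaultdict


class BinnedReadIndex:
    """Hypothetical fixed-width bin index over half-open read intervals."""

    def __init__(self, bin_size=1000):
        self.bin_size = bin_size
        # Maps bin number -> list of (start, end, name) tuples.
        self.bins = defaultdict(list)

    def _bin_range(self, start, end):
        # Bins touched by the half-open interval [start, end).
        return range(start // self.bin_size, (end - 1) // self.bin_size + 1)

    def add(self, name, start, end):
        # Register the read in every bin it overlaps.
        for b in self._bin_range(start, end):
            self.bins[b].append((start, end, name))

    def query(self, start, end):
        # Collect reads overlapping [start, end), visiting only relevant bins.
        found = set()
        for b in self._bin_range(start, end):
            for s, e, name in self.bins.get(b, ()):
                if s < end and start < e:
                    found.add(name)
        return sorted(found)


idx = BinnedReadIndex(bin_size=100)
idx.add("r1", 0, 150)
idx.add("r2", 120, 180)
idx.add("r3", 500, 600)
print(idx.query(100, 130))
```

The cost of a query here depends on the size of the region and the local read depth rather than on the total number of reads, which is the property an editor needs when an assembly holds billions of fragments. A production tool would also need on-disk storage and cheap coordinate updates after edits, which this sketch does not attempt.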

Main Results:

  • Gap5 handled an assembly of 1.1 billion sequence fragments.
  • Analysis of memory, CPU, and I/O usage demonstrated Gap5's efficiency.
  • File sizes were also analyzed in comparison to other software.

Conclusions:

  • Gap5 offers a scalable solution for viewing and editing massive DNA sequence assemblies.
  • The software's design facilitates efficient handling of next-generation sequencing data.
  • Gap5 is available as open-source software within the Staden Package.