Updated: Jun 12, 2026

Gap5--editing the billion fragment sequence assembly.
James K Bonfield (1), Andrew Whitwham
(1) Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK. jkb@sanger.ac.uk
Gap5 is a sequence assembly editor built to cope with the data volumes produced by current DNA sequencing instruments. The authors describe the data structures and algorithms that make it scalable and demonstrate it on an assembly of 1.1 billion sequence fragments.
Area of Science:
- Bioinformatics
- Computational Biology
- Genomics
Background:
- Modern DNA sequencing generates vast amounts of data, overwhelming existing assembly editors.
- Efficiently managing and assembling large-scale genomic datasets is a significant challenge.
Purpose of the Study:
- To introduce Gap5, a scalable editor for DNA sequence assemblies.
- To present the data structures and algorithms enabling Gap5's scalability.
- To evaluate Gap5's performance and resource utilization.
Main Methods:
- Gap5 is implemented in C and Tcl/Tk.
- Its scalability comes from purpose-built data structures and algorithms, which the paper describes.
- Performance is compared with several other programs on a large assembly.
Main Results:
- Gap5 was demonstrated on an assembly of 1.1 billion sequence fragments.
- Analysis of memory, CPU, and I/O usage demonstrated Gap5's efficiency.
- File sizes were also analyzed in comparison to other software.
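The kind of resource measurement listed above can be approximated for any workload. The following is a minimal sketch, not the authors' benchmark: `profile` and `write_fragments` are hypothetical names, the workload simply writes synthetic fixed-length reads to a file, and only CPU time, peak Python heap usage, and output file size are recorded (I/O volume is not measured here).

```python
import os
import tempfile
import time
import tracemalloc


def profile(workload, *args):
    """Run workload(*args); return its result plus CPU seconds and peak traced bytes."""
    tracemalloc.start()
    cpu_start = time.process_time()
    result = workload(*args)
    cpu_seconds = time.process_time() - cpu_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, {"cpu_seconds": cpu_seconds, "peak_bytes": peak_bytes}


def write_fragments(path, n_fragments, read_length=50):
    """Hypothetical workload: write n_fragments synthetic reads, one per line."""
    # newline="\n" keeps the byte count identical across platforms.
    with open(path, "w", newline="\n") as handle:
        for i in range(n_fragments):
            handle.write("ACGT"[i % 4] * read_length + "\n")
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "reads.txt")
        size, stats = profile(write_fragments, path, 10_000)
        print(f"file size: {size} bytes")
        print(f"CPU time: {stats['cpu_seconds']:.4f} s")
        print(f"peak traced memory: {stats['peak_bytes']} bytes")
```

Note that `tracemalloc` only sees allocations made through Python's allocator, so a tool implemented in C would need to be measured from outside the process instead.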
Conclusions:
- Gap5 offers a scalable solution for editing assemblies of massive DNA sequence datasets.
- The software's design facilitates efficient handling of next-generation sequencing data.
- Gap5 is available as open-source software within the Staden Package.