HGGA: hierarchical guided genome assembler.

Riku Walve¹, Leena Salmela²

  • ¹Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland.

BMC Bioinformatics | May 7, 2022
Summary
This summary is machine-generated.

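This study introduces HGGA, a framework for de novo genome assembly that uses additional data, such as genetic linkage maps, to cluster reads before assembly. HGGA produces more contiguous assemblies than plain read assembly and previous data-integration methods, while maintaining or improving correctness.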

Keywords:
Genetic linkage maps, Genome assembly

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • De novo genome assembly typically produces fragmented contigs, so additional data such as genetic linkage maps are needed to resolve the complete genome structure.
  • Previous methods primarily focus on ordering and orienting contigs using supplementary genomic data.
  • Challenges remain in achieving highly contiguous and accurate genome assemblies from raw sequencing reads.

Purpose of the Study:

  • To develop a novel framework for guiding de novo genome assembly using auxiliary data.
  • To improve the contiguity and accuracy of genome assemblies by integrating diverse data types.
  • To introduce a computational tool, HGGA, for enhanced genome assembly.

Main Methods:

  • A read-clustering approach is employed, grouping reads originating from proximal genomic locations based on external data.
  • Independent assembly of clustered reads generates initial contigs.
  • Hierarchical assembly of these contigs refines the overall genome structure.
  • The framework is implemented for genetic linkage maps in the HGGA tool.
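
The three steps above (cluster, assemble each cluster, merge hierarchically) can be sketched in a few lines of Python. This is only an illustration of the control flow, not HGGA's implementation: `toy_assemble` is a hypothetical greedy overlap merger standing in for a real assembler, the integer bin ids stand in for positions derived from a genetic linkage map, and the genome and reads are invented.

```python
from collections import defaultdict


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that is a prefix of b (>= min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def toy_assemble(seqs, min_len=4):
    """Greedy overlap merge. Hypothetical stand-in for a real assembler."""
    seqs = list(dict.fromkeys(seqs))  # drop exact duplicates, keep order
    # Drop sequences fully contained in another sequence.
    seqs = [s for s in seqs if not any(s != t and s in t for t in seqs)]
    while True:
        best = (0, None, None)
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return seqs  # no remaining overlaps: these are the contigs
        merged = seqs[i] + seqs[j][n:]
        seqs = [s for k, s in enumerate(seqs) if k not in (i, j)] + [merged]


def hierarchical_assemble(reads_with_bin, min_len=4):
    # Step 1: cluster reads by their (assumed) map-derived bin.
    clusters = defaultdict(list)
    for read, bin_id in reads_with_bin:
        clusters[bin_id].append(read)
    # Step 2: assemble every cluster independently, keeping bin order.
    level = [toy_assemble(clusters[b], min_len) for b in sorted(clusters)]
    # Step 3: repeatedly assemble the contigs of neighbouring clusters together.
    while len(level) > 1:
        nxt = []
        for k in range(0, len(level), 2):
            group = level[k] + (level[k + 1] if k + 1 < len(level) else [])
            nxt.append(toy_assemble(group, min_len))
        level = nxt
    return level[0]


# Invented 25 bp genome, cut into four 10 bp reads that overlap by 5 bp.
genome = "ATGCCGTAAGCTTGACCAGTTCGGA"
reads = [
    (genome[0:10], 0),
    (genome[5:15], 0),
    (genome[10:20], 1),
    (genome[15:25], 1),
]
contigs = hierarchical_assemble(reads, min_len=4)
print(contigs)
```

Because each cluster only contains reads from one neighbourhood, the assembler at the lowest level sees fewer competing overlaps than it would on the full read set; the later levels then work on contigs rather than raw reads.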

Main Results:

  • HGGA significantly enhances genome assembly contiguity, reducing the number of contigs.
  • Experiments show 1.2 to 9.8 times higher NGA50/N50 values compared to plain read assembly.
  • The tool achieves 1.03 to 6.5 times higher NGA50/N50 compared to previous integration methods.
  • Assembly correctness is maintained or improved compared to read-only assembly.
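
For readers unfamiliar with the contiguity metrics quoted above, the following sketch shows how N50 and its genome-length-based variant NG50 are commonly computed. NGA50 follows the same rule but is computed over aligned blocks (contigs broken at misassemblies and aligned to a reference) instead of raw contig lengths. The function names and all contig lengths are illustrative assumptions, not values from the study.

```python
def nx(lengths, total, fraction=0.5):
    """Largest length L such that pieces of length >= L sum to at least
    fraction * total. Returns 0 if that threshold is never reached."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= fraction * total:
            return length
    return 0


def n50(lengths):
    """N50: threshold is half of the total assembly length."""
    return nx(lengths, sum(lengths))


def ng50(lengths, genome_length):
    """NG50: threshold is half of the (estimated) genome length."""
    return nx(lengths, genome_length)


# Made-up contig lengths for two assemblies of the same data.
plain = [80, 70, 50, 40, 30, 20]
guided = [150, 90, 50]

print(n50(plain), n50(guided))           # N50 of each assembly
print(n50(guided) / n50(plain))          # fold improvement in N50
print(ng50(plain, genome_length=400))    # NG50 against a 400 bp genome
```

A fold change such as "1.2 to 9.8 times higher" is the ratio of the guided assembly's metric to the baseline's, as in the second `print` call.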

Conclusions:

  • The HGGA framework effectively leverages additional genomic data for superior de novo genome assembly.
  • The tool improves assembly contiguity while maintaining or improving correctness.
  • HGGA offers a robust solution for resolving complex genome structures, particularly with long-read sequencing data.