Mash Screen: high-throughput sequence containment estimation for genome discovery.

Brian D Ondov [1,2], Gabriel J Starrett [3], Anna Sappington [4]

  • [1] Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA. brian.ondov@nih.gov

Genome Biology | November 7, 2019
Summary
This summary is machine-generated.

We developed a new algorithm to accurately measure genome containment in metagenomes. This tool aids in contamination screening and discovering novel viral species from sequencing data.

Keywords:
Metagenomics, MinHash, Polyomavirus, SRA, Sequencing, Viral Discovery

Area of Science:

  • Genomics
  • Bioinformatics
  • Metagenomics

Background:

  • MinHash algorithms efficiently estimate genome similarity but struggle with accurate containment analysis.
  • Estimating genome containment is crucial for metagenomic analysis, contamination screening, and novel genome discovery.
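The difference between similarity and containment can be illustrated with exact k-mer sets. The following is a minimal sketch, not code from the paper: the sequences, the k-mer length, and the function names are all made up for illustration. It shows that a genome fully present in a larger mixture has containment 1.0 while its Jaccard similarity to that mixture stays well below 1.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Symmetric similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Asymmetric measure: fraction of the k-mers in a that also occur in b."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical toy data: a short "genome" embedded in a longer "mixture".
genome = "ACGTTGCAAGGCTTAC"
mixture = "TTTTGGGGCCCCAAAA" + genome + "GATTACAGATTACACC"
k = 5

g, m = kmers(genome, k), kmers(mixture, k)
print("Jaccard:    ", jaccard(g, m))
print("Containment:", containment(g, m))
```

Because the mixture contributes many k-mers that the genome lacks, the union grows and the Jaccard value shrinks, even though every genome k-mer is present. Containment divides by the genome's own k-mer count instead, so it is not diluted by the rest of the mixture.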

Purpose of the Study:

  • To introduce a novel online algorithm for precise measurement of genome and proteome containment within sequencing read sets.
  • To enable reliable contamination screening and retrospective analysis for novel genome discovery in metagenomic datasets.

Main Methods:

  • Developed an online algorithm to calculate genome and proteome containment.
  • Applied the algorithm to estimate containment of all NCBI RefSeq genomes within all SRA metagenomes.
  • Utilized the tool for contamination assessment and novel genome identification.
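One way an online (single-pass) containment estimate could work is sketched below. This is an illustration under stated assumptions, not the tool's actual implementation: each reference is reduced to a bottom-s MinHash sketch, the reads are streamed once, and containment is estimated as the fraction of each reference's sketch hashes observed in the stream. The hash function (BLAKE2b), the sketch size, the names `bottom_sketch` and `screen`, and the toy sequences are all hypothetical choices; reverse complements and sequencing errors are ignored for brevity.

```python
import hashlib


def kmer_hash(kmer):
    """Stable 64-bit hash of a k-mer (hash choice is illustrative)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(seq, k, size):
    """Bottom-s sketch: the `size` smallest distinct k-mer hashes of seq."""
    hashes = {kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)}
    return set(sorted(hashes)[:size])


def screen(reads, references, k, size):
    """Stream reads once and estimate containment of each reference in them."""
    sketches = {name: bottom_sketch(seq, k, size)
                for name, seq in references.items()}
    # Index every sketch hash by the references whose sketch holds it.
    owners = {}
    for name, sketch in sketches.items():
        for h in sketch:
            owners.setdefault(h, set()).add(name)
    seen = {name: set() for name in sketches}
    for read in reads:  # any iterable works, e.g. a generator over a file
        for i in range(len(read) - k + 1):
            h = kmer_hash(read[i:i + k])
            for name in owners.get(h, ()):
                seen[name].add(h)
    # Containment estimate: fraction of sketch hashes found in the reads.
    return {name: len(seen[name]) / len(sketch)
            for name, sketch in sketches.items() if sketch}


# Hypothetical toy data: overlapping reads drawn from one reference only.
present = "ACGTTGCAAGGCTTACGATCCGTA"
absent = "TTTTTTTTTTTTTTTT"
reads = [present[i:i + 12] for i in range(0, len(present) - 11, 4)]

result = screen(reads, {"present": present, "absent": absent}, k=5, size=8)
print(result)
```

The reads never need to be assembled or held in memory together: only the reference sketches and the set of matched hashes are stored, which is what makes this kind of approach practical for screening many references against large read sets.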

Main Results:

  • The algorithm accurately measures genome containment in both assembled and unassembled sequencing reads.
  • Provided comprehensive containment estimates across the NCBI RefSeq and SRA databases.
  • Successfully identified a previously unknown polyomavirus species from a public metagenome.

Conclusions:

  • The developed algorithm offers a robust solution for genome containment estimation in metagenomics.
  • This tool enhances the capability for microbial community analysis, contamination detection, and discovery of novel biological entities.
  • The tool also facilitates large-scale retrospective analysis of existing metagenomic data.