Comparison of high-throughput sequencing data compression tools.

Ibrahim Numanagić¹, James K Bonfield², Faraz Hach¹,³

  • ¹ School of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada.

Nature Methods | November 8, 2016
Summary
This summary is machine-generated.

This study benchmarked compression methods for high-throughput sequencing (HTS) data, evaluating how well they reduce the large FASTQ and SAM files that growing genomic datasets produce.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • High-throughput sequencing (HTS) generates vast amounts of data, commonly stored in FASTQ (raw reads) or SAM (mapped reads) formats.
  • The exponential growth of HTS data presents significant storage and management challenges due to large file sizes.
  • Efficient data compression is crucial for reducing the storage footprint of genomic datasets.
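
To make the two formats named above concrete, here is a minimal Python sketch of how a FASTQ record and a SAM alignment line are laid out. It is an illustration only and is not taken from the study: the names `FastqRecord`, `read_fastq`, and `parse_sam_line`, as well as the example reads, are hypothetical. It assumes the common four-line FASTQ layout and the eleven mandatory tab-separated SAM fields.

```python
import io
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    """One raw read: identifier, bases, and per-base quality string."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle) -> Iterator[FastqRecord]:
    """Yield records from a text stream using the four-line FASTQ layout.

    Assumption: each record is exactly four lines (no wrapped sequences).
    Reading stops at end of input or at the first empty line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:], sequence, quality)


# The eleven mandatory columns of a SAM alignment line, in order.
SAM_FIELDS = (
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
)


def parse_sam_line(line: str) -> dict:
    """Split one SAM alignment line into its mandatory fields (as strings)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(SAM_FIELDS):
        raise ValueError("SAM line has fewer than 11 mandatory fields")
    return dict(zip(SAM_FIELDS, fields[: len(SAM_FIELDS)]))


# Hypothetical example data, made up for illustration.
example_fastq = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!#I\n"
records = list(read_fastq(io.StringIO(example_fastq)))

example_sam = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
parsed = parse_sam_line(example_sam)

for record in records:
    print(record.name, record.sequence, record.quality)
print(parsed["RNAME"], parsed["POS"], parsed["CIGAR"])
```

Both formats repeat the read name, bases, and quality string for every read, and SAM adds mapping columns on top. That repetition is one reason the files grow so large.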

Purpose of the Study:

  • To conduct a comprehensive benchmarking study of existing HTS data compression methods.
  • To evaluate the effectiveness of various compression techniques on diverse HTS datasets.
  • To provide insights into optimal compression strategies for genomic data storage.

Main Methods:

  • Development of an automated framework for systematic evaluation.
  • Testing a comprehensive set of HTS data, including both raw (FASTQ) and mapped (SAM) formats.
  • Benchmarking performance based on compression ratios and computational efficiency.
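
The kind of measurement described above can be sketched in a few lines of Python. This is not the study's framework: it assumes the general-purpose compressors in the Python standard library (`gzip`, `bz2`, `lzma`) as stand-ins for the specialized HTS tools, and it runs on synthetic FASTQ text. The helper names `make_synthetic_fastq` and `benchmark` are hypothetical.

```python
import bz2
import gzip
import lzma
import random
import time


def make_synthetic_fastq(n_reads=2000, read_len=100, seed=0) -> bytes:
    """Build random FASTQ text; a stand-in for real sequencing data."""
    rng = random.Random(seed)
    lines = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        # Phred+33 quality characters drawn from a narrow score range.
        qual = "".join(chr(33 + rng.randint(20, 40)) for _ in range(read_len))
        lines.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(lines).encode("ascii")


# General-purpose compressors used as stand-ins (an assumption of this sketch).
COMPRESSORS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def benchmark(data: bytes) -> list:
    """Measure compression ratio and wall-clock time for each compressor."""
    results = []
    for name, (compress, decompress) in COMPRESSORS.items():
        start = time.perf_counter()
        packed = compress(data)
        compress_seconds = time.perf_counter() - start

        start = time.perf_counter()
        restored = decompress(packed)
        decompress_seconds = time.perf_counter() - start

        # A lossless compressor must reproduce the input exactly.
        if restored != data:
            raise RuntimeError(f"{name} did not round-trip the data")

        results.append({
            "tool": name,
            "ratio": len(data) / len(packed),  # original size / compressed size
            "compress_seconds": compress_seconds,
            "decompress_seconds": decompress_seconds,
        })
    return results


data = make_synthetic_fastq()
results = benchmark(data)
for row in sorted(results, key=lambda r: r["ratio"], reverse=True):
    print(f"{row['tool']:>5}  ratio={row['ratio']:.2f}  "
          f"compress={row['compress_seconds']:.3f}s  "
          f"decompress={row['decompress_seconds']:.3f}s")
```

Random bases compress far less well than real reads, so the ratios printed here say nothing about the tools in the study. The sketch only shows the shape of the measurement: a size ratio, timings for both directions, and a round-trip check.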

Main Results:

  • Identification of compression methods that significantly reduce HTS data size.
  • Comparative analysis of different compression algorithms' performance across various datasets.
  • Quantification of the storage footprint reduction achieved by different methods.

Conclusions:

  • Specific compression methods demonstrate superior performance in reducing HTS data size.
  • The findings guide the selection of appropriate compression tools for efficient genomic data management.
  • Optimized compression strategies are vital for handling the increasing volume of HTS data.