Data compression for sequencing data.

Sebastian Deorowicz¹, Szymon Grabowski²

¹Institute of Informatics, Silesian University of Technology, Gliwice, Poland.

Algorithms for Molecular Biology : AMB

November 21, 2013

Summary

This summary is machine-generated.

High-throughput sequencing generates massive volumes of data, which makes compression necessary for efficient storage and processing. This review examines the role of compression techniques in computational biology and the breadth of their applications.

Area of Science:

Genomics
Bioinformatics
Computational Biology

Background:

Next-generation sequencing (NGS) technologies generate vast amounts of data, posing significant storage and computational challenges.
Effective data management strategies are crucial for the advancement of genomic research and personalized medicine.

Purpose of the Study:

To quantify the need for data compression when handling sequencing data.
To explain the fundamental principles, data types, formats, and algorithms involved in sequencing data compression.
To highlight the widespread and often surprising applications of compression in computational biology.
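
As one illustration of a basic principle behind sequence compression, a four-letter nucleotide alphabet needs only 2 bits per base rather than the 8 bits of a plain text character. The following is a minimal sketch under stated assumptions, not a method taken from the review: the function names `pack_bases` and `unpack_bases` are hypothetical, and the input is assumed to contain only A, C, G and T (real reads may also contain N and other symbols, which this sketch does not handle).

```python
# Hypothetical helpers for illustration only; not the method of any specific tool.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def pack_bases(seq: str) -> bytes:
    """Pack an A/C/G/T string into 2 bits per base (4 bases per byte)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            # Raises KeyError on any symbol outside A/C/G/T (assumption: clean input).
            byte = (byte << 2) | _CODE[base]
        # Left-align a short final chunk so every byte is decoded the same way.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack_bases(data: bytes, length: int) -> str:
    """Recover the original string; `length` trims padding in the last byte."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(_BASE[(byte >> shift) & 3])
    return "".join(bases[:length])


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    packed = pack_bases(seq)
    print(len(seq), "bases ->", len(packed), "bytes")
    print(unpack_bases(packed, len(seq)) == seq)
```

The sequence length has to be stored alongside the packed bytes, because the final byte may carry padding bits.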

Main Methods:

Review of existing literature on data compression algorithms and tools relevant to biological data.
Analysis of sequencing data types and formats (e.g., FASTQ, BAM).
Comparative assessment of specialized compression algorithms and software.
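
To make the FASTQ format concrete: each record has four lines (an identifier line starting with `@`, the bases, a `+` separator line, and the quality string). A common idea in specialized FASTQ compressors is to treat identifiers, bases, and quality scores as separate streams. The sketch below only illustrates that idea with Python's general-purpose `zlib` module; it is not the approach of any particular tool discussed in the review. The function names are hypothetical, and it assumes every `+` line is bare (no repeated identifier) and every record occupies exactly four lines.

```python
import zlib


def split_streams(fastq_text: str):
    """Split 4-line FASTQ records into identifier, base, and quality streams."""
    lines = fastq_text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    # Lines 0, 1, 3 of each record; line 2 is the '+' separator (assumed bare).
    return lines[0::4], lines[1::4], lines[3::4]


def compress_streams(fastq_text: str):
    """Compress each stream independently with zlib (a stand-in codec)."""
    return tuple(
        zlib.compress("\n".join(stream).encode("ascii"), 9)
        for stream in split_streams(fastq_text)
    )


def restore(compressed) -> str:
    """Rebuild the FASTQ text from the three compressed streams."""
    ids, bases, quals = (
        zlib.decompress(blob).decode("ascii").split("\n") for blob in compressed
    )
    return "".join(f"{i}\n{b}\n+\n{q}\n" for i, b, q in zip(ids, bases, quals))


if __name__ == "__main__":
    sample = (
        "@read1\nACGTACGT\n+\nIIIIHHHH\n"
        "@read2\nTTGACCAA\n+\nFFFFGGGG\n"
    )
    blobs = compress_streams(sample)
    print(restore(blobs) == sample)
```

Separating the streams lets each one be handled by a model suited to its statistics; here the same codec is used for all three purely to keep the example short.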

Main Results:

Quantitative demonstration that the scale of sequencing data makes compression necessary.
Description of core compression concepts and their application to diverse biological data.
Comparison of various compression tools, highlighting their strengths and weaknesses for specific data types.

Conclusions:

Data compression is indispensable for managing the data deluge from modern sequencing.
Understanding compression principles is vital for efficient bioinformatics workflows.
Compression techniques are fundamental to numerous computational biology applications, extending beyond simple data storage.