Data compression for sequencing data.

Sebastian Deorowicz (1), Szymon Grabowski (2)

  • (1) Institute of Informatics, Silesian University of Technology, Gliwice, Poland.

Algorithms for Molecular Biology : AMB | November 21, 2013
Summary
This summary is machine-generated.

High-throughput sequencing generates massive volumes of data, which makes compression necessary for efficient storage and processing. This review explores the critical role and pervasive applications of compression techniques in computational biology.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Next-generation sequencing (NGS) technologies generate vast amounts of data, posing significant storage and computational challenges.
  • Effective data management strategies are crucial for the advancement of genomic research and personalized medicine.

Purpose of the Study:

  • To quantify the need for data compression of sequencing data.
  • To explain the fundamental principles, data types, formats, and algorithms involved in sequencing data compression.
  • To highlight the widespread and often surprising applications of compression in computational biology.

Main Methods:

  • Review of existing literature on data compression algorithms and tools relevant to biological data.
  • Analysis of sequencing data types and formats (e.g., FASTQ, BAM).
  • Comparative assessment of specialized compression algorithms and software.
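
To make the FASTQ format mentioned above concrete, here is a minimal Python sketch of one idea often used when compressing such files: split each record into its identifier, base, and quality streams and compress each stream on its own. The two records, the function names, and the use of zlib as the back-end compressor are illustrative assumptions, not the specific methods compared in the review. The sketch also assumes that the third line of every record is a bare "+".

```python
import zlib

# Two made-up FASTQ records (4 lines each: id, bases, '+', qualities).
FASTQ_TEXT = (
    "@read1\nACGTACGTAC\n+\nIIIIIHHHGG\n"
    "@read2\nACGTTCGTAC\n+\nIIIIHHHGGF\n"
)

def split_streams(fastq_text):
    """Separate a FASTQ text into id, base, and quality lists."""
    lines = fastq_text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    # Line 2 of each record (the '+' separator) is assumed bare and is dropped.
    return lines[0::4], lines[1::4], lines[3::4]

def compress_streams(fastq_text):
    """Compress each stream independently with zlib (hypothetical back end)."""
    ids, bases, quals = split_streams(fastq_text)
    streams = (("ids", ids), ("bases", bases), ("quals", quals))
    return {
        name: zlib.compress("\n".join(stream).encode("ascii"))
        for name, stream in streams
    }

def restore(compressed):
    """Rebuild the original FASTQ text from the three compressed streams."""
    ids, bases, quals = (
        zlib.decompress(compressed[name]).decode("ascii").split("\n")
        for name in ("ids", "bases", "quals")
    )
    return "".join(f"{i}\n{b}\n+\n{q}\n" for i, b, q in zip(ids, bases, quals))

if __name__ == "__main__":
    packed = compress_streams(FASTQ_TEXT)
    print(restore(packed) == FASTQ_TEXT)
```

Separating the streams lets a compressor model each kind of data on its own terms, since read identifiers, bases, and quality scores have very different statistics. On an input this small the compressed output may well be larger than the original; the sketch shows the structure only.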

Main Results:

  • Demonstration of the quantitative need for compression due to the scale of sequencing data.
  • Description of core compression concepts and their application to diverse biological data.
  • Comparison of various compression tools, highlighting their strengths and weaknesses for specific data types.
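
As a small illustration of a core compression concept applied to biological data, the Python sketch below packs a DNA string over the alphabet A, C, G, T into 2 bits per base instead of the 8 bits of a plain text character. The function names and the fixed four-letter alphabet are assumptions made for this example. Real reads also contain symbols such as N, which this sketch does not handle, and the review does not prescribe this particular encoding.

```python
# Hypothetical 2-bit code for the four standard bases.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def pack(seq):
    """Pack a DNA string into bytes, four bases per byte.

    Returns the packed bytes and the original length, which is
    needed to discard padding when unpacking.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for ch in chunk:
            byte = (byte << 2) | CODE[ch]
        # Left-align a final chunk shorter than four bases.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out), len(seq)

def unpack(data, n):
    """Recover the first n bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n])

if __name__ == "__main__":
    packed, length = pack("ACGTAC")
    print(len(packed), unpack(packed, length))  # prints: 2 ACGTAC
```

Six bases fit in two bytes here rather than six, a fourfold reduction before any statistical modelling is applied.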

Conclusions:

  • Data compression is indispensable for managing the data deluge from modern sequencing.
  • Understanding compression principles is vital for efficient bioinformatics workflows.
  • Compression techniques are fundamental to numerous computational biology applications, extending beyond simple data storage.