



Efficient DNA sequence compression with neural networks.

Milton Silva (1,2), Diogo Pratas (1,2,3), Armando J Pinho (1,2)

  • (1) Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal.

GigaScience, November 12, 2020
Summary
This summary is machine-generated.

GeCo3 is a genomic sequence compressor that uses a neural network to mix multiple context models. It compresses DNA sequences better than its predecessor GeCo2 in both reference-free and reference-based modes, which reduces the cost of storing and analyzing large volumes of genomic data.

Keywords:
DNA sequence compression; context mixing; lossless data compression; mixture of experts; neural networks


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • The exponential growth of genomic data necessitates efficient lossless compression methods for storage and analysis.
  • Existing neural network approaches for DNA compression lag behind specialized tools like GeCo2.
  • A gap exists in models specifically designed for the unique characteristics of DNA sequences.

Purpose of the Study:

  • To develop a novel genomic sequence compressor, GeCo3, that integrates neural networks with DNA-specific models.
  • To improve upon the compression efficiency of existing state-of-the-art DNA compressors.
  • To leverage neural networks for enhanced context modeling in DNA sequence compression.

Main Methods:

  • Developed GeCo3, a genomic sequence compressor utilizing neural networks for mixing multiple context models.
  • Incorporated substitution-tolerant context models tailored for DNA sequences.
  • Benchmarked GeCo3 on diverse datasets including human genomes, viral genomes, and ancient DNA, both in reference-free and reference-based modes.
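
The idea of mixing several context models with a neural network can be illustrated with a small sketch. This is not GeCo3's actual architecture or code: the class names `ContextModel` and `SoftmaxMixer`, the model orders, the learning rate, and the single-layer softmax mixer are all assumptions chosen for brevity. Each context model predicts the next base from the previous k bases, and the mixer learns online how to combine those predictions. The returned value is the estimated cost in bits per base, which an arithmetic coder driven by the mixed probabilities would approximately achieve.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"
INDEX = {base: i for i, base in enumerate(ALPHABET)}


class ContextModel:
    """Order-k finite-context model with additive smoothing (illustrative)."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha
        self.counts = defaultdict(lambda: [0] * len(ALPHABET))

    def _context(self, seq, pos):
        # The k bases preceding position pos (shorter near the start).
        return seq[max(0, pos - self.order):pos]

    def predict(self, seq, pos):
        counts = self.counts[self._context(seq, pos)]
        total = sum(counts) + self.alpha * len(ALPHABET)
        return [(c + self.alpha) / total for c in counts]

    def update(self, seq, pos):
        self.counts[self._context(seq, pos)][INDEX[seq[pos]]] += 1


class SoftmaxMixer:
    """Single-layer softmax network trained online with cross-entropy loss.
    A hypothetical stand-in for a neural mixer, not GeCo3's network."""

    def __init__(self, n_inputs, lr=0.05):
        self.lr = lr
        self.w = [[0.0] * n_inputs for _ in ALPHABET]
        self.b = [0.0] * len(ALPHABET)

    def forward(self, x):
        z = [sum(wi * xi for wi, xi in zip(row, x)) + b
             for row, b in zip(self.w, self.b)]
        m = max(z)  # subtract the maximum for numerical stability
        e = [math.exp(v - m) for v in z]
        s = sum(e)
        return [v / s for v in e]

    def train(self, x, probs, target):
        # Gradient of cross-entropy w.r.t. each logit is (p_k - y_k).
        for k, row in enumerate(self.w):
            g = probs[k] - (1.0 if k == target else 0.0)
            self.b[k] -= self.lr * g
            for j, xj in enumerate(x):
                row[j] -= self.lr * g * xj


def mixed_bits_per_base(seq, orders=(1, 3, 6), lr=0.05):
    """Estimate the coding cost of seq under the mixed model, in bits per base."""
    if not seq:
        return 0.0
    models = [ContextModel(k) for k in orders]
    mixer = SoftmaxMixer(len(models) * len(ALPHABET), lr)
    total_bits = 0.0
    for pos, base in enumerate(seq):
        # Predict first, then learn from the true base, as a decoder could.
        x = [p for m in models for p in m.predict(seq, pos)]
        probs = mixer.forward(x)
        target = INDEX[base]
        total_bits -= math.log2(probs[target])
        mixer.train(x, probs, target)
        for m in models:
            m.update(seq, pos)
    return total_bits / len(seq)


if __name__ == "__main__":
    print(mixed_bits_per_base("ACGT" * 200))
```

Because the models and the mixer are updated only after each base has been scored, a decoder could reproduce the same probabilities step by step, which is what keeps a scheme of this kind lossless. A highly repetitive input should cost well under the 2 bits per base of a fixed four-symbol code.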

Main Results:

  • GeCo3 demonstrated significant compression ratio improvements over GeCo2, ranging from 2.4% to 7.1% in reference-free compression.
  • In reference-based compression, GeCo3 outperformed the state-of-the-art by 10.1% to 12.4%.
  • GeCo3 is 1.7 to 3 times slower than GeCo2, but it maintains constant RAM usage and scales efficiently with sequence size.
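
To make the percentages above concrete, the following sketch shows one common way such figures are computed. It assumes that "improvement" means the relative reduction in compressed size against a baseline; the summary does not state the exact definition used in the paper. The function names and the byte counts and timings in the example are hypothetical, not measurements from the study.

```python
def relative_improvement(baseline_bytes, new_bytes):
    """Percentage reduction in compressed size relative to a baseline."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * (baseline_bytes - new_bytes) / baseline_bytes


def slowdown_factor(baseline_seconds, new_seconds):
    """How many times slower the new tool is than the baseline."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return new_seconds / baseline_seconds


if __name__ == "__main__":
    # Hypothetical numbers chosen only to illustrate the arithmetic.
    print(relative_improvement(1000, 929))   # size shrinks from 1000 to 929 bytes
    print(slowdown_factor(60.0, 150.0))      # run time grows from 60 s to 150 s
```

Under this definition, a file that a baseline compresses to 1000 bytes and a newer tool compresses to 929 bytes corresponds to a 7.1% improvement, and a run that takes 150 seconds instead of 60 is 2.5 times slower.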

Conclusions:

  • GeCo3 represents a significant advancement in genomic sequence compression, offering substantial gains through its neural network-based mixing approach.
  • The portable mixing method allows for easy integration into other data compression or analysis tools.
  • GeCo3 is freely available under GPLv3, promoting its adoption in the scientific community.