Adaptive efficient compression of genomes.

Sebastian Wandelt¹, Ulf Leser

¹Institute for Computer Science, Humboldt-Universität zu Berlin, Berlin, Germany. wandelt@informatik.hu-berlin.de.

Algorithms for Molecular Biology : AMB

November 14, 2012

Summary

This summary is machine-generated.

High-throughput DNA sequencing generates massive volumes of data. This study introduces an adaptive, parallel referential compression method that compresses genomic data efficiently and lets users tune the balance between memory usage and speed.

Area of Science:

Genomics
Bioinformatics
Computational Biology

Background:

High-throughput sequencing technologies generate DNA data at an unprecedented rate.
The increasing volume of sequence data presents significant computational challenges for analysis and storage.
Existing referential compression algorithms often require substantial memory and exhibit slow run times.
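
To make the term concrete: referential compression stores a sequence as a list of pointers into a reference sequence plus the few bases that differ. The sketch below is a minimal illustration of that general idea, not the method of the paper. The k-mer hash index, the greedy longest-match rule, the op format, and all function names are assumptions chosen for brevity.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict:
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def compress(target: str, reference: str, k: int = 4) -> list:
    """Encode target as ('M', ref_pos, length) matches and ('L', text) literals."""
    index = build_index(reference, k)
    ops = []
    literal = []  # bases that could not be matched against the reference
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Try every reference position that shares the next k-mer and
        # keep the one that extends to the longest exact match.
        for pos in index.get(target[i:i + k], ()):
            length = 0
            while (i + length < len(target)
                   and pos + length < len(reference)
                   and target[i + length] == reference[pos + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= k:
            if literal:
                ops.append(("L", "".join(literal)))
                literal = []
            ops.append(("M", best_pos, best_len))
            i += best_len
        else:
            literal.append(target[i])
            i += 1
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(ops: list, reference: str) -> str:
    """Rebuild the target by copying matches from the reference."""
    parts = []
    for op in ops:
        if op[0] == "M":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)
```

In this sketch the index over the whole reference is what costs memory, and the search for the longest match is what costs time, which is the tension the bullets above describe.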

Purpose of the Study:

To develop an adaptive, parallel, and efficient referential sequence compression method.
To enable fine-tuning of the trade-off between memory requirements and compression speed.
To address the computational challenges posed by large-scale genomic data.

Main Methods:

An adaptive referential compression approach utilizing parallel processing.
Implementation of an adjustable trade-off between memory use and compression speed.
Benchmarking against state-of-the-art algorithms on human genome datasets.
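
One hypothetical way to expose a memory/speed knob together with parallelism is to cut the sequences into blocks and compress each block independently. The sketch below assumes this block scheme for illustration only; the summary does not say how the paper implements its trade-off. It also assumes a substitution-only model (no insertions or deletions), and the names `block_size`, `workers`, `compress_blocks`, and `decompress_blocks` are invented here.

```python
from concurrent.futures import ThreadPoolExecutor


def compress_block(job):
    """Encode one target block against the reference block at the same offset."""
    target_block, reference_block = job
    # Keep only positions where the target differs from the reference,
    # or where the reference block is too short to compare against.
    diffs = [(i, base) for i, base in enumerate(target_block)
             if i >= len(reference_block) or base != reference_block[i]]
    return len(target_block), diffs


def compress_blocks(target, reference, block_size, workers=2):
    """Split both sequences into blocks and compress the blocks concurrently."""
    jobs = [(target[s:s + block_size], reference[s:s + block_size])
            for s in range(0, len(target), block_size)]
    # Threads only show the structure; CPU-bound pure Python would need
    # processes or native code to gain real speed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress_block, jobs))


def decompress_blocks(blocks, reference, block_size):
    """Rebuild the target block by block from the reference and the diffs."""
    out = []
    for n, (length, diffs) in enumerate(blocks):
        start = n * block_size
        # Pad with 'N' where the reference is shorter than the target block;
        # every padded position is overwritten by a stored diff.
        block = list(reference[start:start + length].ljust(length, "N"))
        for i, base in diffs:
            block[i] = base
        out.append("".join(block))
    return "".join(out)
```

Under these assumptions, a smaller `block_size` bounds how much of the reference each worker touches at once, while more `workers` process more blocks at the same time at the cost of a larger total working set.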

Main Results:

Using only 12 MB of memory, matched the compression ratio of leading algorithms (400:1) on human genomes.
Compressed a complete human genome in 11 seconds when given 9 GB of memory.
Ran nearly three times faster than competitors while using less main memory.
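
As a rough sense of scale for the 400:1 ratio, the arithmetic can be sketched as below. The raw size of 3,000 MB is an assumption (about three billion bases stored at one byte each) and does not come from the summary; the function name is likewise illustrative.

```python
def compressed_size_mb(raw_size_mb: float, ratio: float) -> float:
    """Size after compression for a ratio expressed as ratio:1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_size_mb / ratio


# Assumed raw genome size: 3,000 MB (one byte per base, illustrative only).
size = compressed_size_mb(3000, 400)
print(f"{size} MB")
```

Under that assumption, a genome compressed at 400:1 would occupy about 7.5 MB.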

Conclusions:

The proposed method offers a highly efficient solution for compressing large genomic datasets.
It provides a flexible approach to manage computational resources for sequence data analysis.
This advancement is crucial for managing the ever-increasing data output from modern sequencing technologies.