



GVC: efficient random access compression for gene sequence variations.

Yeremia Gunawan Adhisantoso¹, Jan Voges², Christian Rohlfing³

  • ¹ Institut für Informationsverarbeitung and L3S Research Center, Leibniz University Hannover, Hannover, Germany. adhisant@tnt.uni-hannover.de

BMC Bioinformatics | March 28, 2023
Summary
This summary is machine-generated.

Genomic Variant Codec (GVC) compresses gene sequence variation data while keeping it randomly accessible, producing output 21% smaller than existing random-access-capable methods. This supports efficient storage and remote access for large genomic datasets.

Keywords:
Compression, Random access, VCF, Variants


Area of Science:

  • Genomics
  • Bioinformatics
  • Data Compression

Background:

  • High-throughput sequencing generates vast amounts of genomic data, crucial for fields like precision medicine and oncology.
  • Identifying gene sequence variations is key to understanding phenotypic variation, for example in genome-wide association studies.
  • Current data storage methods struggle to keep pace with the rapid growth of genomic information.

Purpose of the Study:

  • To introduce a novel approach for compressing gene sequence variations.
  • To enable random access to compressed genomic variation data.
  • To improve the efficiency of storing and accessing large-scale genomic datasets.

Main Methods:

  • Developed the Genomic Variant Codec (GVC), a novel compression algorithm.
  • Utilized binarization and joint row- and column-wise sorting of variation blocks.
  • Incorporated the JBIG image compression standard for efficient entropy coding.
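The binarization and sorting steps listed above can be illustrated with a small sketch. This is not the GVC implementation: the bit-plane binarization, the greedy nearest-neighbor ordering by Hamming distance, and all function names are assumptions chosen for illustration, and the JBIG entropy-coding stage is omitted entirely. The idea it shows is that reordering rows and columns so that similar ones sit next to each other tends to make a binary matrix more regular, which a bi-level image coder can then exploit.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]


def bit_planes(alleles: Matrix, num_bits: int) -> List[Matrix]:
    """Split an integer matrix into binary matrices, one per bit (LSB first)."""
    return [
        [[(value >> bit) & 1 for value in row] for row in alleles]
        for bit in range(num_bits)
    ]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_order(vectors: Sequence[Sequence[int]]) -> List[int]:
    """Greedy nearest-neighbor ordering, starting from vector 0.

    Hypothetical heuristic: each step appends the unused vector closest
    (in Hamming distance) to the last one chosen; ties go to the lower index.
    """
    if not vectors:
        return []
    order = [0]
    remaining = list(range(1, len(vectors)))
    while remaining:
        last = vectors[order[-1]]
        nearest = min(remaining, key=lambda i: (hamming(last, vectors[i]), i))
        remaining.remove(nearest)
        order.append(nearest)
    return order


def sort_rows_and_columns(binary: Matrix) -> Tuple[Matrix, List[int], List[int]]:
    """Reorder rows, then columns, and return the permutations used."""
    row_order = greedy_order(binary)
    rows_sorted = [binary[i] for i in row_order]
    columns = [list(col) for col in zip(*rows_sorted)]
    col_order = greedy_order(columns)
    sorted_matrix = [[row[j] for j in col_order] for row in rows_sorted]
    return sorted_matrix, row_order, col_order


def unsort(sorted_matrix: Matrix, row_order: List[int], col_order: List[int]) -> Matrix:
    """Invert sort_rows_and_columns using the stored permutations."""
    original = [[0] * len(col_order) for _ in range(len(row_order))]
    for si, oi in enumerate(row_order):
        for sj, oj in enumerate(col_order):
            original[oi][oj] = sorted_matrix[si][sj]
    return original
```

Because the row and column permutations are returned alongside the sorted matrix, the transformation is lossless: a decoder that stores both permutations can restore the original layout with `unsort`.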

Main Results:

  • GVC reduced the genotype data of the 1000 Genomes Project from 758 GiB to 890 MiB.
  • The compressed output is 21% smaller than that of existing random-access-capable methods.
  • GVC demonstrated a superior trade-off between compression efficiency and random access capability.
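To put the reported sizes in perspective, the short calculation below derives the compression ratio from the 758 GiB and 890 MiB figures. The baseline size it prints is only an inference: it assumes the 21% saving refers to the same dataset and is measured relative to the competing method's output, neither of which this summary states explicitly.

```python
GIB = 1024 ** 3  # bytes per gibibyte
MIB = 1024 ** 2  # bytes per mebibyte


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Original size divided by compressed size."""
    return original_bytes / compressed_bytes


def implied_baseline_size(size: float, relative_saving: float) -> float:
    """Size of a baseline that `size` undercuts by `relative_saving` (0..1)."""
    return size / (1.0 - relative_saving)


ratio = compression_ratio(758 * GIB, 890 * MIB)
baseline_mib = implied_baseline_size(890, 0.21)  # assumption, see lead-in

print(f"compression ratio: about {ratio:.0f}:1")
print(f"implied baseline size: about {baseline_mib:.0f} MiB")
```

The ratio works out to roughly 872:1, meaning the compressed genotype data occupies a little over 0.1% of the original size.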

Conclusions:

  • GVC facilitates efficient storage of large gene sequence variation collections.
  • The random access feature enables seamless remote data access and application integration.
  • The GVC software is open-source, promoting accessibility and further development.
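The random access property mentioned above can be sketched in general terms: if records are compressed in independent blocks, a reader only needs to fetch and decode the one block containing the record of interest. The example below is a minimal illustration of that principle, not GVC's format. It uses `zlib` from the Python standard library as a stand-in for the actual codec, and the block layout, function names, and newline-delimited records are all hypothetical.

```python
import zlib
from typing import List


def compress_blocks(records: List[bytes], block_size: int) -> List[bytes]:
    """Compress records in independent blocks of `block_size` records each.

    Assumes no record contains a newline byte, since newlines are used
    as the record separator inside each block.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    for start in range(0, len(records), block_size):
        chunk = b"\n".join(records[start:start + block_size])
        blocks.append(zlib.compress(chunk))
    return blocks


def fetch_record(blocks: List[bytes], block_size: int, index: int) -> bytes:
    """Return one record by decoding only the block that contains it."""
    block_id, offset = divmod(index, block_size)
    chunk = zlib.decompress(blocks[block_id])
    return chunk.split(b"\n")[offset]


# Usage: ten toy variant records, four per block.
records = [f"chr1\t{pos}\tA\tG".encode() for pos in range(100, 110)]
blocks = compress_blocks(records, block_size=4)
record = fetch_record(blocks, block_size=4, index=9)
```

Smaller blocks make each lookup cheaper but give the compressor less context per block, which is the kind of trade-off between compression efficiency and random access that the results above refer to.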