An Adaptive Difference Distribution-based Coding with Hierarchical Tree Structure for DNA Sequence Compression.

Wenrui Dai¹, Hongkai Xiong², Xiaoqian Jiang³

  • ¹Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200240, China. daiwenrui@sjtu.edu.cn

Proceedings. Data Compression Conference | October 27, 2015
Summary
This summary is machine-generated.

This study introduces an adaptive difference distribution-based coding framework for reference-based DNA sequence compression. The authors report a 150% compression improvement over GReEn, the leading reference-based compressor.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Data Compression

Background:

  • Existing reference-based DNA sequence compression methods do not fully leverage intrinsic statistical properties.
  • Current approaches often focus solely on approximate matches, limiting compression efficiency.

Purpose of the Study:

  • To propose an adaptive difference distribution-based coding framework for enhanced DNA sequence compression.
  • To improve compression ratios by exploiting nucleotide fragment statistics and hierarchical structures.

Main Methods:

  • Developed an adaptive difference distribution-based coding framework using a hierarchical tree structure for nucleotide fragments.
  • Implemented flexible sub-fragment sizes and matching offsets to concentrate difference sequence distributions.
  • Utilized a Hamming-like weighted distance measure for approximate repeat matching to balance accuracy and overhead.
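
The matching step described in the list above can be illustrated with a small sketch. It is not the authors' algorithm: the cost function, the weights `mismatch_weight` and `offset_weight`, the search window `max_offset`, and all function names are hypothetical choices made only to show how a Hamming-like distance can trade mismatches against the cost of describing an offset.

```python
from math import log2


def weighted_distance(target, candidate, offset,
                      mismatch_weight=1.0, offset_weight=0.1):
    """Hamming-like cost: weighted mismatch count plus a penalty that grows
    with the size of the offset that has to be described.
    Both weights are hypothetical illustration values."""
    mismatches = sum(a != b for a, b in zip(target, candidate))
    offset_penalty = log2(abs(offset) + 1)
    return mismatch_weight * mismatches + offset_weight * offset_penalty


def best_match(reference, target, position, max_offset=4):
    """Try every offset in [-max_offset, max_offset] around `position` in the
    reference and return (cost, offset) for the cheapest one, or None."""
    size = len(target)
    best = None
    for offset in range(-max_offset, max_offset + 1):
        start = position + offset
        if start < 0 or start + size > len(reference):
            continue  # candidate window falls outside the reference
        cost = weighted_distance(target, reference[start:start + size], offset)
        if best is None or cost < best[0]:
            best = (cost, offset)
    return best


def difference_sequence(reference, target, position, offset):
    """0 where the matched reference window agrees with the target, 1 otherwise."""
    start = position + offset
    window = reference[start:start + len(target)]
    return [0 if a == b else 1 for a, b in zip(window, target)]


reference = "ACGTACGTTGCA"
target = "TACG"
cost, offset = best_match(reference, target, position=2)
diff = difference_sequence(reference, target, position=2, offset=offset)
```

In this toy input the target fragment is found one position to the right of the expected location, so the search selects an offset of 1 and the difference sequence is all zeros. A larger `offset_weight` would make the search prefer nearby, less exact matches.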

Main Results:

  • The proposed scheme demonstrates significant compression improvements.
  • Achieved a 150% compression enhancement compared to the leading reference-based compressor, GReEn.
  • Successfully compacted both the difference sequence and auxiliary parameters like sub-fragment size and matching offset.
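
Why a concentrated difference sequence helps can be seen from its empirical entropy, which lower-bounds the average bits per symbol of an order-0 entropy coder. The sketch below is only an illustration under assumed toy data; the paper's actual coder and its statistics are not described in this summary, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def empirical_entropy(symbols):
    """Order-0 empirical entropy in bits per symbol."""
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy data (assumed): a target with all four bases equally frequent, and a
# difference sequence against a well-matched reference with one mismatch.
raw_target = "ACGT" * 4
difference = [0] * 15 + [1]

raw_bits = empirical_entropy(raw_target)
diff_bits = empirical_entropy(difference)
```

Here the raw target needs 2 bits per base under an order-0 model, while the difference sequence, being almost entirely zeros, needs well under half a bit per symbol. The auxiliary parameters (sub-fragment size and matching offset) add their own cost, which is why the summary notes that they are compacted as well.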

Conclusions:

  • The adaptive difference distribution-based coding framework offers superior performance for DNA sequence compression.
  • The method effectively balances matching accuracy with the overhead of describing matching parameters.
  • This approach represents a substantial advancement in reference-based DNA compression technology.
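
The balance between matching accuracy and parameter overhead can be sketched as a recursive split decision over a fragment tree. This is a simplified illustration, not the hierarchical structure from the paper: the binary halving rule, the constants `LEAF_OVERHEAD`, `OFFSETS`, and `MIN_SIZE`, and the assumption that reference and target have equal length are all hypothetical.

```python
LEAF_OVERHEAD = 2.0    # hypothetical cost of describing one leaf's size and offset
OFFSETS = (-1, 0, 1)   # hypothetical set of candidate matching offsets
MIN_SIZE = 2           # hypothetical smallest sub-fragment size


def leaf_cost(reference, target, start, size):
    """Cheapest (cost, offset) for matching target[start:start+size] as one leaf.
    Assumes len(reference) == len(target), so offset 0 is always valid."""
    best = None
    for offset in OFFSETS:
        ref_start = start + offset
        if ref_start < 0 or ref_start + size > len(reference):
            continue
        mismatches = sum(reference[ref_start + i] != target[start + i]
                         for i in range(size))
        cost = mismatches + LEAF_OVERHEAD
        if best is None or cost < best[0]:
            best = (cost, offset)
    return best


def build_tree(reference, target, start, size):
    """Return (cost, leaves), where leaves is a list of (start, size, offset).
    A fragment is split in half only when the split lowers the total cost."""
    cost, offset = leaf_cost(reference, target, start, size)
    whole = (cost, [(start, size, offset)])
    if size < 2 * MIN_SIZE:
        return whole
    half = size // 2
    left = build_tree(reference, target, start, half)
    right = build_tree(reference, target, start + half, size - half)
    split_cost = left[0] + right[0]
    if split_cost < whole[0]:
        return (split_cost, left[1] + right[1])
    return whole


reference = "ACGTTGCAGTCA"
target = "ACGTTGGCAGTC"   # first half aligned, second half shifted by one base
total_cost, leaves = build_tree(reference, target, 0, len(target))
```

With this input a single leaf costs 8.0 (six mismatches plus overhead), while two leaves with offsets 0 and -1 cost 4.0 in total, so the fragment is split once. Further splits are rejected because each extra leaf would add overhead without removing any mismatch.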