PHONI: Streamed Matching Statistics with Multi-Genome References.

Christina Boucher (1), Travis Gagie (2), I Tomohiro (3)

  • (1) University of Florida, Gainesville, USA.

Proceedings. Data Compression Conference | November 15, 2021
Summary
This summary is machine-generated.

This study introduces a simplified, streaming algorithm for computing the matching statistics of patterns against compressed genomic databases. The approach enables efficient parallel processing of large patterns and online analysis that detects pattern incompressibility.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Data Compression

Background:

  • Computing matching statistics for patterns against compressed genomic data is computationally challenging.
  • Existing two-pass methods need significant memory, which limits scalability.

Purpose of the Study:

  • To develop a simplified, streaming algorithm for computing matching statistics in compressed genomic databases.
  • To enable efficient parallel processing of large patterns and online analysis.

Main Methods:

  • A novel streaming algorithm that simplifies existing two-pass methods.
  • Implementation of the algorithm for computing matching statistics.
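
For readers unfamiliar with the term, the sketch below shows what matching statistics are, using the standard definition: for each position i of a pattern, the length of the longest prefix of pattern[i:] that occurs somewhere in the reference text. It is a brute-force illustration on plain strings, not the algorithm from the paper, which works on a compressed index. The function name `matching_statistics` is a hypothetical choice for this example.

```python
def matching_statistics(pattern: str, text: str) -> list[int]:
    """Brute-force matching statistics (illustrative only).

    ms[i] is the length of the longest prefix of pattern[i:] that
    occurs as a substring of text. This is far slower than an
    index-based method and is meant only to make the definition concrete.
    """
    ms = []
    for i in range(len(pattern)):
        length = 0
        # Extend the match one character at a time while the
        # extended substring still occurs somewhere in the text.
        while (i + length < len(pattern)
               and pattern[i:i + length + 1] in text):
            length += 1
        ms.append(length)
    return ms


if __name__ == "__main__":
    # Toy reference and pattern; real inputs would be genomes.
    print(matching_statistics("CGTA", "ACGT"))
```

A position with a value of 0 marks a character that does not appear in the reference at all; long values mark stretches of the pattern that the reference already contains.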

Main Results:

  • The simplified algorithm runs in a streaming fashion, which reduces memory requirements.
  • It enables parallel computation of matching statistics for large patterns such as human chromosomes.
  • It facilitates low-latency online analysis for detecting pattern incompressibility.
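
The idea of online analysis can be sketched as follows, under several assumptions. The generator consumes the pattern one character at a time and reports, after each character, the length of the longest suffix of the input read so far that occurs in the reference. This is a forward variant of match lengths, chosen here because it is simple to compute online; it is not the paper's method. The names `stream_match_lengths` and `first_incompressible_position`, and the `min_len` and `patience` thresholds, are hypothetical, and the rule "match lengths stay short for a while" is only a stand-in for whatever incompressibility criterion an application would use.

```python
from typing import Iterable, Iterator, Optional


def stream_match_lengths(chars: Iterable[str], text: str) -> Iterator[int]:
    """Yield one match length per input character (illustrative only).

    After reading each character, yields the length of the longest
    suffix of the input consumed so far that occurs in text. Only the
    current window is kept, so memory does not grow with the input.
    """
    window = ""  # longest suffix of the consumed input found in text
    for c in chars:
        candidate = window + c
        # Drop characters from the left until the candidate occurs
        # in the text (or becomes empty).
        while candidate and candidate not in text:
            candidate = candidate[1:]
        window = candidate
        yield len(window)


def first_incompressible_position(chars: Iterable[str], text: str,
                                  min_len: int, patience: int) -> Optional[int]:
    """Hypothetical heuristic: return the index at which match lengths
    have stayed below min_len for `patience` consecutive characters,
    or None if that never happens."""
    run = 0
    for i, length in enumerate(stream_match_lengths(chars, text)):
        run = run + 1 if length < min_len else 0
        if run >= patience:
            return i  # stop early; the rest of the stream is not read
    return None


if __name__ == "__main__":
    print(list(stream_match_lengths("ACGTTT", "ACGT")))
    print(first_incompressible_position("ACGTTT", "ACGT", min_len=2, patience=2))
```

Because the detector returns as soon as its condition is met, a caller can stop reading the pattern at that point, which is the kind of low-latency behavior the results describe.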

Conclusions:

  • The streaming approach offers a more practical and scalable solution for pattern matching in compressed genomic data.
  • The method supports low-latency online analysis with reduced memory use.