Building genomic data structures from compressed representations using prefix-free parsing.

Rahul Varki¹, Christina Boucher²

  • ¹ Department of Computer and Information Science and Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida 32611, USA. rvarki@ufl.edu

Genome Research | May 15, 2026
Summary
This summary is machine-generated.

Prefix-free parsing (PFP) is a preprocessing technique that compresses repetitive text, which lets bioinformatics tools handle massive genome datasets. Essential data structures can then be built directly from the compressed output, overcoming the memory limitations that constrain large-scale pangenomics.

Area of Science:

  • Bioinformatics
  • Genomics
  • Computational Biology

Background:

  • High-throughput sequencing has produced large pangenomic datasets, some exceeding petabyte scale.
  • Traditional bioinformatics tools struggle with memory limitations on these massive datasets.
  • A need exists for methods that process data directly from compressed representations.

Purpose of the Study:

  • To survey prefix-free parsing (PFP) as a solution for handling large-scale genomic data.
  • To explain the core principles and applications of PFP.
  • To outline future research directions in PFP for bioinformatics.

Main Methods:

  • Prefix-free parsing (PFP) as a preprocessing technique.
  • Compression of repetitive text within large datasets.
  • Construction of data structures directly from compressed PFP output.
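
The parsing step listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `prefix_free_parse` and `reconstruct`, the window size `w`, the modulus `p`, the sentinel characters `#` and `$`, and the use of `zlib.crc32` recomputed per window (where practical tools typically use a rolling hash) are all choices made for this example. The idea it shows is that a window of `w` characters whose hash is 0 modulo `p` acts as a trigger string, the text is cut into phrases that begin and end at triggers and overlap by `w` characters, and the output is a sorted dictionary of distinct phrases plus a parse listing the phrase ranks in text order.

```python
import zlib


def prefix_free_parse(text, w=4, p=7):
    """Cut `text` into overlapping phrases delimited by trigger windows.

    Returns (dictionary, parse): the sorted distinct phrases and the
    sequence of their ranks in text order. Illustrative sketch only.
    """
    if "#" in text or "$" in text:
        raise ValueError("text must not contain the sentinels '#' or '$'")

    # Assumed padding: one '#' in front, w copies of '$' at the end.
    padded = "#" + text + "$" * w
    last_window_start = len(padded) - w

    phrases = []
    start = 0
    for i in range(1, last_window_start + 1):
        window = padded[i:i + w]
        # Assumed trigger rule: hash of the window is 0 modulo p.
        is_trigger = zlib.crc32(window.encode()) % p == 0
        if is_trigger or i == last_window_start:
            # The phrase ends with this window; the next one starts with it.
            phrases.append(padded[start:i + w])
            start = i

    dictionary = sorted(set(phrases))
    rank = {phrase: r for r, phrase in enumerate(dictionary)}
    parse = [rank[phrase] for phrase in phrases]
    return dictionary, parse


def reconstruct(dictionary, parse, w=4):
    """Invert the parse: glue phrases back, skipping each w-char overlap."""
    padded = dictionary[parse[0]]
    for r in parse[1:]:
        padded += dictionary[r][w:]
    return padded[1:-w]  # drop the leading '#' and the trailing '$' * w


# A highly repetitive toy input (hypothetical data).
example_text = "GATTACAGATTACAGATTTCAGATTACA" * 20
example_dictionary, example_parse = prefix_free_parse(example_text)
```

Because triggers depend only on the local window, repeated regions of the input are cut at the same places and yield identical phrases. Each distinct phrase is stored once in the dictionary, and only the parse grows with the number of repeats, which is where the compression of repetitive text comes from.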

Main Results:

  • PFP compresses repetitive text efficiently.
  • It enables the construction of essential data structures directly from compressed data.
  • It addresses the memory limitations of traditional bioinformatics tools on large datasets.
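
As a toy illustration of working from the compressed representation, the sketch below computes per-character counts of the original text using only a PFP dictionary and parse, without rebuilding the text. It is far simpler than the data structures the survey is concerned with and is not taken from the source. The function name `char_counts_from_pfp`, the sentinels `#` and `$`, the window size `w = 2`, and the hand-built dictionary and parse for the text `GATTACAGATTACA` (with `AC` assumed to be the only trigger string) are all hypothetical.

```python
from collections import Counter


def char_counts_from_pfp(dictionary, parse, w):
    """Count characters of the original text from a dictionary and parse.

    Consecutive phrases overlap by w characters, so every phrase
    contributes only its characters after the first w, except that the
    first phrase of the text also contributes its leading w characters.
    """
    phrase_freq = Counter(parse)

    # Leading w characters of the very first phrase are counted once.
    counts = Counter(dictionary[parse[0]][:w])

    # One pass over distinct phrases, weighted by how often each occurs.
    for rank, freq in phrase_freq.items():
        for ch, n in Counter(dictionary[rank][w:]).items():
            counts[ch] += freq * n

    # Remove the padding sentinels assumed by this sketch.
    for sentinel in "#$":
        counts.pop(sentinel, None)
    return dict(counts)


# Hand-built example for the padded text "#GATTACAGATTACA$$" with w = 2.
toy_dictionary = ["#GATTAC", "ACA$$", "ACAGATTAC"]
toy_parse = [0, 2, 1]
toy_counts = char_counts_from_pfp(toy_dictionary, toy_parse, w=2)
```

The work done here scales with the total size of the dictionary plus the length of the parse, not with the length of the uncompressed text, which hints at why operating on the compressed form can reduce memory and time on repetitive collections.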

Conclusions:

  • PFP is a crucial method for managing and analyzing large-scale pangenomic data.
  • It overcomes memory constraints by operating on compressed representations.
  • Further research can expand PFP's applications in bioinformatics.