PgRC: pseudogenome-based read compressor.

Tomasz M Kowalski¹, Szymon Grabowski¹

¹Institute of Applied Computer Science, Lodz University of Technology, Lodz 90-924, Poland.

Bioinformatics (Oxford, England)

January 2, 2020

Summary

This summary is machine-generated.

Pseudogenome-based Read Compressor (PgRC) offers improved compression for massive sequencing data. This DNA compression tool achieves superior compression ratios compared to existing methods, efficiently managing large biological datasets.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

The volume of data produced by high-throughput sequencing is growing faster than the pace predicted by Moore's law.
Efficient storage and transmission of this genomic data are critical challenges.
Existing FASTQ compressors have limitations in compression ratio and decompression resources.

Purpose of the Study:

To introduce a novel algorithm for compressing DNA sequencing data.
To address the limitations of current data compression methods in genomics.

Main Methods:

Development of Pseudogenome-based Read Compressor (PgRC), an in-memory algorithm.
Utilizing an approximation of the shortest common superstring for high-quality reads.
Comparative performance analysis against SPRING and Minicom.
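
The pseudogenome idea in the methods above can be illustrated with a small sketch. This summary does not describe how PgRC itself builds its approximation, so the code below swaps in the textbook greedy overlap-merge heuristic for the shortest common superstring; it is not the authors' implementation. The function names (`overlap`, `greedy_pseudogenome`, `map_reads`) and the toy reads are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def overlap(a: str, b: str) -> int:
    """Return the length of the longest suffix of `a` that is a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_pseudogenome(reads: List[str]) -> str:
    """Approximate a shortest common superstring by greedy overlap merging.

    Textbook heuristic, used here as a stand-in (assumption): repeatedly
    merge the pair of strings with the largest suffix/prefix overlap.
    """
    # Drop exact duplicates and reads fully contained in another read.
    uniq = list(dict.fromkeys(reads))
    parts = [r for r in uniq if not any(r != o and r in o for o in uniq)]

    while len(parts) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(parts):
            for j, b in enumerate(parts):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Glue the best pair together, writing the shared overlap only once.
        merged = parts[best_i] + parts[best_j][best_k:]
        parts = [p for idx, p in enumerate(parts) if idx not in (best_i, best_j)]
        parts.append(merged)

    return parts[0] if parts else ""


def map_reads(pseudogenome: str, reads: List[str]) -> List[Tuple[int, int]]:
    """Describe each read as an (offset, length) pair within the pseudogenome."""
    return [(pseudogenome.find(r), len(r)) for r in reads]


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGG", "ACGGTT", "CGTA"]  # hypothetical reads
    pg = greedy_pseudogenome(toy_reads)
    for read, (pos, length) in zip(toy_reads, map_reads(pg, toy_reads)):
        # Every read can be recovered as a slice of the pseudogenome.
        print(read, pos, pg[pos:pos + length] == read)
```

Storing one shared string plus a short (offset, length) pair per read is where the saving comes from: overlapping reads no longer repeat the bases they have in common. The quadratic pair search is acceptable only for toy inputs; a practical tool would need an indexed overlap search.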

Main Results:

On average, PgRC achieves better compression ratios than its main competitors, by up to 15% over SPRING and by up to 20% over Minicom.
Its decompression speed is comparable to that of existing methods.
Successful implementation as an in-memory algorithm for DNA stream compression.
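
For readers unfamiliar with the term, the "DNA stream" is the sequence of nucleotide lines in a FASTQ file, as opposed to the read identifiers and quality scores. The sketch below shows one way to separate those streams, assuming the common four-line, unwrapped FASTQ layout; `split_fastq_streams` is a hypothetical helper and is not part of PgRC.

```python
from typing import Iterable, List, Tuple


def split_fastq_streams(lines: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Split four-line FASTQ records into identifier, DNA, and quality streams.

    Assumes unwrapped records: '@id', sequence, '+', quality (one line each).
    """
    ids: List[str] = []
    dna: List[str] = []
    quals: List[str] = []
    record: List[str] = []

    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line and not record:
            continue  # tolerate blank lines between records
        record.append(line)
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in {header!r}")
            ids.append(header[1:])
            dna.append(seq)
            quals.append(qual)
            record = []

    if record:
        raise ValueError("truncated FASTQ record at end of input")
    return ids, dna, quals


if __name__ == "__main__":
    sample = [
        "@read1", "ACGTAC", "+", "IIIIII",
        "@read2", "GTACGG", "+", "IIHHGG",
    ]
    ids, dna, quals = split_fastq_streams(sample)
    print(dna)  # the DNA stream that a read compressor would take as input
```

Separating the streams matters because each one has different statistics, so a compressor can apply a model suited to nucleotides without interference from identifiers or quality scores.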

Conclusions:

PgRC offers a significant advancement in compressing large-scale sequencing data.
The method provides a more efficient solution for managing the growing volume of genomic information.
PgRC presents a viable alternative for researchers dealing with substantial biological datasets.