SCALCE: boosting sequence compression algorithms using locally consistent encoding.

Faraz Hach¹, Ibrahim Numanagic, Can Alkan

  • ¹School of Computing Science, Simon Fraser University, Burnaby, Canada, V5A 1S6. fhach@cs.sfu.ca

Bioinformatics (Oxford, England) | October 11, 2012
Summary
This summary is machine-generated.

SCALCE is a "boosting" scheme for compressing high-throughput sequencing data. It reorganizes reads so that general-purpose compressors such as gzip achieve better compression rates and shorter running times, without requiring a reference genome.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • High-throughput sequencing (HTS) platforms generate massive datasets that strain computational infrastructure.
  • Data management, storage, and analysis are major logistical hurdles for HTS adoption.
  • General-purpose compression algorithms such as gzip are suboptimal for genomic data because they do not exploit its small alphabet or the high similarity among reads.

Purpose of the Study:

  • To develop a fast and efficient compression algorithm specifically designed for HTS data.
  • To address challenges in data management, storage, and communication for large genomic datasets.
  • To provide additional capabilities for data analysis, such as random access and indexing.

Main Methods:

  • Introduction of SCALCE (Sequence Compression Algorithm using Locally Consistent Encoding), a 'boosting' scheme based on Locally Consistent Parsing.
  • Reorganization of reads to improve compression speed and rate, independent of the compression algorithm and without using a reference genome.
  • Implementation in C++ with gzip and bzip2 compression options, supporting multithreading.
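
The read-reordering idea above can be illustrated with a minimal Python sketch. It is not the authors' implementation: SCALCE selects core substrings via Locally Consistent Parsing, whereas this sketch swaps in a much simpler bucket key (the lexicographically smallest k-mer of each read). The function names, the value of `k`, and the simulated error-free reads are all assumptions made for illustration. The sketch only shows that grouping similar reads lets DEFLATE, the algorithm behind gzip, find longer matches.

```python
import random
import zlib


def smallest_kmer(read: str, k: int = 12) -> str:
    """Bucket key: the lexicographically smallest k-mer in the read.

    A simple stand-in for SCALCE's core substrings; this is NOT
    Locally Consistent Parsing.
    """
    return min(read[i:i + k] for i in range(len(read) - k + 1))


def reorder_reads(reads, k=12):
    """Sort reads by bucket key so reads sharing a key end up adjacent."""
    return sorted(reads, key=lambda r: (smallest_kmer(r, k), r))


def compressed_size(reads):
    """Bytes used by the newline-joined reads after DEFLATE (zlib level 6)."""
    return len(zlib.compress("\n".join(reads).encode("ascii"), 6))


def simulate_reads(genome_len=200_000, n_reads=20_000, read_len=100, seed=0):
    """Toy data (an assumption): error-free reads from a random genome."""
    rng = random.Random(seed)
    genome = "".join(rng.choice("ACGT") for _ in range(genome_len))
    starts = [rng.randrange(genome_len - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


reads = simulate_reads()
reordered = reorder_reads(reads)

baseline = compressed_size(reads)      # reads in their original order
boosted = compressed_size(reordered)   # same reads, grouped by bucket key

print(f"original order: {baseline} bytes")
print(f"reordered:      {boosted} bytes")
print(f"improvement:    {baseline / boosted:.2f}x")
```

The reordering keeps every read but discards the original read order, and the compressor itself is unchanged. That is the sense in which such a scheme "boosts" an existing algorithm rather than replacing it. The sizes printed here come from synthetic data and say nothing about the figures reported in the paper.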

Main Results:

  • On reads alone, SCALCE improves the gzip compression rate by up to 4.19x, and SCALCE + gzip runs 2.09x faster than gzip alone.
  • On full FASTQ files (reads, read names, and quality scores), SCALCE improves the compression rate over gzip by up to 3.34x.
  • Compared with BEETL, SCALCE + gzip compresses up to 2.01x better and runs 5.17x faster.

Conclusions:

  • SCALCE offers significant improvements in compression rate and speed for HTS data.
  • The algorithm effectively compresses reads, read names, and quality scores.
  • SCALCE provides a valuable tool for managing and analyzing large-scale genomic data.