Aligned genomic data compression via improved modeling.

Idoia Ochoa¹, Mikel Hernaez, Tsachy Weissman

  • ¹ Department of Electrical Engineering, Stanford University, 350 Serra Mall, Stanford, CA, USA.

Journal of Bioinformatics and Computational Biology | November 15, 2014
Summary
This summary is machine-generated.

Affordable whole-genome sequencing generates massive data. This study shows that data modeling significantly improves compression of aligned sequencing reads, enabling efficient storage and analysis for personalized medicine.

Keywords:
SAM file, compression, context modeling

Area of Science:

  • Genomics
  • Bioinformatics
  • Data Compression

Background:

  • Advances in next-generation sequencing (NGS), such as Illumina's HiSeq X, enable affordable whole-genome sequencing ($1,000 per human genome).
  • The resulting volumes of genomic data are unprecedented and demand efficient storage and processing solutions.
  • Current data compression methods often focus on low-coverage data and lack effective data modeling for aligned reads.

Purpose of the Study:

  • To demonstrate the benefits of data modeling for compressing aligned sequencing reads.
  • To improve compression ratios beyond existing algorithms for high-coverage genomic data.
  • To develop compressed data formats suitable for direct use in downstream bioinformatics applications.

Main Methods:

  • Developing and applying data models specifically designed for aligned sequencing data.
  • Evaluating compression performance against established algorithms, particularly for high-coverage datasets.
  • Assessing the suitability of the compressed data for direct downstream analysis.
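
To illustrate why data modeling matters for compression, the sketch below estimates how many bits per symbol an ideal entropy coder would need with and without context. It is a toy illustration, not the models used in the study: the function name `empirical_entropy` and the example sequence are assumptions, and the estimate ignores the cost of learning the model.

```python
import math
from collections import Counter, defaultdict


def empirical_entropy(symbols, order=0):
    """Empirical conditional entropy in bits per symbol.

    Each symbol is predicted from the `order` symbols that precede it.
    order=0 means no context (plain symbol frequencies).
    """
    # Count how often each symbol follows each context.
    context_counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        context = tuple(symbols[i - order:i])
        context_counts[context][symbols[i]] += 1

    total = sum(sum(c.values()) for c in context_counts.values())
    bits = 0.0
    for counts in context_counts.values():
        n = sum(counts.values())
        for k in counts.values():
            # An ideal coder spends -log2(k / n) bits on each occurrence.
            bits -= k * math.log2(k / n)
    return bits / total


# Hypothetical, highly structured sequence: every symbol is fully
# determined by the one before it.
toy = "ACGT" * 50
no_context = empirical_entropy(toy, order=0)
one_symbol_context = empirical_entropy(toy, order=1)
print(no_context, one_symbol_context)
```

On this artificial input the context-free estimate is 2 bits per symbol, while a single symbol of context removes all uncertainty. Real aligned reads are far less regular, so the gain from context modeling would be smaller and would depend on the data.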

Main Results:

  • Data modeling significantly enhances the compression ratio of aligned sequencing reads compared to previous methods.
  • The Pareto-optimal barrier between compression rate and speed, previously reported for low-coverage data, does not apply to high-coverage aligned data.
  • The proposed compression method splits data in a way that facilitates direct operations in the compressed domain.
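
One loose way to picture the benefit of splitting the data is to store each field of an aligned-read record in its own compressed stream, so that a downstream tool can read one field without touching the others. The sketch below is an assumption-laden illustration, not the study's format: it uses the 11 mandatory SAM columns, general-purpose `zlib` in place of the paper's models, and hypothetical helper names (`compress_by_field`, `read_field`) with made-up records.

```python
import zlib

# The 11 mandatory columns of a SAM alignment line, in order.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]


def compress_by_field(sam_lines):
    """Split tab-delimited records into per-field streams and compress each one."""
    streams = {name: [] for name in SAM_FIELDS}
    for line in sam_lines:
        cols = line.rstrip("\n").split("\t")
        # zip stops after the mandatory columns; optional tags are ignored here.
        for name, value in zip(SAM_FIELDS, cols):
            streams[name].append(value)
    return {name: zlib.compress("\n".join(values).encode("ascii"))
            for name, values in streams.items()}


def read_field(compressed, name):
    """Decompress a single field stream, leaving all others untouched."""
    return zlib.decompress(compressed[name]).decode("ascii").split("\n")


# Made-up records for illustration only.
records = [
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t104\t60\t4M\t*\t0\t0\tCGTA\tIIIH",
    "r3\t16\tchr1\t110\t60\t4M\t*\t0\t0\tGTAC\tHHII",
]
packed = compress_by_field(records)
positions = [int(p) for p in read_field(packed, "POS")]
print(positions)
```

Here only the `POS` stream is decompressed to recover the mapping positions. This shows selective access rather than true computation on compressed data, which would need a purpose-built format.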

Conclusions:

  • Effective data modeling is crucial for achieving superior compression of aligned genomic data.
  • This approach overcomes limitations of existing compressors for high-coverage datasets.
  • The developed compression strategy supports efficient storage and direct analysis of genomic data, advancing personalized medicine.