GeneCodeq: quality score compression and improved genotyping using a Bayesian framework.

Daniel L Greenfield (1), Oliver Stegle (2), Alban Rrustemi (1)

  • (1) PetaGene, Ideaspace, 3 Charles Babbage Rd, Cambridge CB3 0GT, UK.

Bioinformatics (Oxford, England) | June 30, 2016
Summary
This summary is machine-generated.

GeneCodeq, a new Bayesian method, improves genomic data compression by adjusting quality scores. This method enhances compressibility without compromising genotyping accuracy, offering significant file size reduction.

Area of Science:

  • Genomics
  • Bioinformatics
  • Data Compression

Background:

  • The cost of genome sequencing has decreased exponentially, leading to a massive increase in genomic data.
  • Quality scores, which record the confidence that each base was sequenced correctly, account for much of the entropy in short-read data.
  • Current lossless compression methods are nearing their theoretical limits, necessitating lossy approaches.
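
To make the notion of quality-score entropy concrete, here is a minimal Python sketch. It is not taken from the GeneCodeq paper: it assumes the common Phred+33 ASCII encoding used in FASTQ files and estimates only the zeroth-order (per-symbol) empirical entropy of a quality string. All function names are illustrative.

```python
import math
from collections import Counter


def decode_quality_string(qual: str, offset: int = 33) -> list[int]:
    """Convert an ASCII quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def phred_to_error_prob(q: int) -> float:
    """Phred definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def empirical_entropy_bits(symbols) -> float:
    """Zeroth-order empirical entropy, in bits per symbol.

    This ignores correlations between neighbouring scores, so it is only
    a rough indication of how compressible the scores are.
    """
    counts = Counter(symbols)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    qual = "IIIIHHGG5555"  # made-up quality string for illustration
    scores = decode_quality_string(qual)
    print(scores)
    print([phred_to_error_prob(q) for q in scores[:2]])
    print(empirical_entropy_bits(qual))
```

Under this estimate a run of identical quality symbols has zero entropy, which is the intuition behind lossy schemes that map quality scores onto fewer distinct values.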

Purpose of the Study:

  • To develop a novel method, GeneCodeq, for compressing genomic quality scores.
  • To improve the compressibility of quality scores without negatively impacting genotyping accuracy.
  • To provide theoretical insights into corpus-based quality score compression.

Main Methods:

  • GeneCodeq employs a Bayesian approach inspired by coding theory.
  • The method utilizes a k-mer corpus to reduce the entropy of quality scores.
  • Users can optionally supply a reference panel of expected sequence variation to improve the model's accuracy.
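
This summary does not describe GeneCodeq's actual model, so the following Python sketch is only a toy illustration of the general idea of a Bayesian quality adjustment driven by a k-mer corpus. It treats the error probability implied by a Phred score as the prior and updates it with Bayes' rule depending on whether the base is covered by a k-mer found in the corpus. The likelihood values, the coverage rule, the cap of 41, and all function names are hypothetical assumptions, not the authors' method.

```python
import math


def posterior_error_prob(prior_err: float, in_corpus: bool,
                         p_hit_if_correct: float = 0.99,
                         p_hit_if_error: float = 0.01) -> float:
    """Bayes' rule for P(error | corpus evidence).

    The two likelihoods are hypothetical illustration values:
    P(k-mer in corpus | base correct) and P(k-mer in corpus | base wrong).
    """
    like_err = p_hit_if_error if in_corpus else 1.0 - p_hit_if_error
    like_ok = p_hit_if_correct if in_corpus else 1.0 - p_hit_if_correct
    numerator = prior_err * like_err
    return numerator / (numerator + (1.0 - prior_err) * like_ok)


def error_prob_to_phred(p: float, max_q: int = 41) -> int:
    """Convert an error probability back to a capped integer Phred score."""
    if p <= 0.0:
        return max_q
    return min(max_q, round(-10.0 * math.log10(p)))


def adjust_read_qualities(seq: str, quals: list[int], corpus: set[str],
                          k: int = 3, max_q: int = 41) -> list[int]:
    """Adjust each score according to whether any covering k-mer is in the corpus."""
    supported = [False] * len(seq)
    for start in range(len(seq) - k + 1):
        if seq[start:start + k] in corpus:
            for i in range(start, start + k):
                supported[i] = True
    adjusted = []
    for q, hit in zip(quals, supported):
        prior = 10 ** (-q / 10)  # error probability implied by the Phred score
        adjusted.append(error_prob_to_phred(posterior_error_prob(prior, hit), max_q))
    return adjusted


if __name__ == "__main__":
    corpus = {"ACG"}  # toy corpus; a real corpus would hold many longer k-mers
    print(adjust_read_qualities("ACGTT", [30, 20, 30, 10, 10], corpus))
```

In this toy setup, bases covered by a known k-mer are pushed toward the capped maximum score, so the adjusted scores take fewer distinct values and become easier to compress. A realistic model would need calibrated likelihoods and safeguards for genuine variants absent from the corpus, which is the role the reference panel above is described as playing.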

Main Results:

  • GeneCodeq achieves compression ratios superior to existing methods for FASTQ and SAM/BAM/CRAM files.
  • The method can be combined with other lossy compression techniques for further entropy reduction.
  • Empirical evaluations demonstrate improved genotyping accuracy as a side effect of compression.

Conclusions:

  • GeneCodeq offers an effective strategy for compressing genomic quality scores.
  • The method provides significant file size reduction while maintaining or improving data utility.
  • GeneCodeq represents a valuable tool for managing the growing volume of genomic data.