Textual data compression in computational biology: a synopsis.

Raffaele Giancarlo¹, Davide Scaturro, Filippo Utro

  • ¹ Dipartimento di Matematica ed Applicazioni, Università di Palermo, Palermo, Italy. raffaele@math.unipa.it

Bioinformatics (Oxford, England) | March 3, 2009
Summary
This summary is machine-generated.

Textual data compression, rooted in information theory, gives bioinformatics and computational biology tools that go beyond saving space. The same techniques support storage and indexing of large datasets, classification and data analysis, and the analysis of biological networks.

Area of Science:

  • Bioinformatics and Computational Biology
  • Information Theory
  • Data Science

Background:

  • Textual data compression, originating from information theory, is traditionally linked to data communication and storage.
  • Emerging applications demonstrate deep connections between compression techniques and classification, data mining, and analysis.
  • Significant research effort has focused on applying compression to computational biology for tasks like large dataset management and biological network analysis.
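
The connection between compression and classification mentioned in this list can be illustrated with the normalized compression distance (NCD), a well-known compression-based similarity measure. The sketch below is an illustration only and is not taken from the reviewed article: it assumes Python's standard `zlib` module as a stand-in compressor, and the helper names (`compressed_size`, `ncd`, `random_dna`, `mutate`) and the synthetic sequences are hypothetical, chosen for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between x and y.

    Values near 0 suggest the inputs share most of their content;
    values near 1 suggest they share very little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(length: int, seed: int) -> bytes:
    """Synthetic DNA string (assumption: uniform random bases, fixed seed)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length)).encode("ascii")


def mutate(seq: bytes, every: int) -> bytes:
    """Substitute every `every`-th base with a different base."""
    swap = {ord("A"): ord("C"), ord("C"): ord("G"),
            ord("G"): ord("T"), ord("T"): ord("A")}
    out = bytearray(seq)
    for i in range(0, len(out), every):
        out[i] = swap[out[i]]
    return bytes(out)


a = random_dna(2000, seed=0)
a_variant = mutate(a, every=100)   # close relative of a
b = random_dna(2000, seed=1)       # unrelated sequence

print(f"NCD(a, variant of a) = {ncd(a, a_variant):.3f}")
print(f"NCD(a, unrelated b)  = {ncd(a, b):.3f}")
```

A sequence and its lightly mutated copy should score a smaller distance than two unrelated sequences, which is what makes a distance of this kind usable for clustering or classification. A general-purpose compressor is only a rough proxy here: zlib matches repeats within a 32 KB window, so it suits short sequences, and a specialized DNA compressor would likely be a better choice for whole genomes.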

Purpose of the Study:

  • To systematically review the primary applications of compression techniques within bioinformatics and computational biology.
  • To provide a unified framework for understanding the core ideas and methodologies employed in this interdisciplinary field.
  • To highlight the practical relevance and software resources available to the research community.

Main Methods:

  • Systematic literature review focusing on compression techniques in bioinformatics.
  • Categorization and organization of key compression applications in computational biology.
  • Identification and presentation of fundamental theoretical concepts and practical software tools.

Main Results:

  • Identification of diverse applications of compression in bioinformatics, including data storage, indexing, and network analysis.
  • A structured overview of compression methodologies relevant to biological data.
  • Provision of pointers to software prototypes and benchmark datasets for community use.

Conclusions:

  • Compression techniques are integral to modern bioinformatics and computational biology, extending beyond traditional data handling.
  • The reviewed methods can support biological data analysis and the analysis of biological networks.
  • Accessible software and data resources facilitate the adoption and further development of these techniques.