Compressive genomics for protein databases.

Noah M Daniels¹, Andrew Gallant, Jian Peng

  • ¹Department of Computer Science, Tufts University, Medford, MA 02451, USA.

Bioinformatics (Oxford, England) | July 2, 2013
Summary
This summary is machine-generated.

Compressively Accelerated Protein BLAST (CaBLASTP) significantly speeds up homology searches by operating on compressed data. This innovation accelerates protein structure prediction and orthology mapping, overcoming computational bottlenecks in large sequence databases.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Exponential growth in protein sequence databases creates computational bottlenecks for homology searches.
  • Existing homology search tools, such as PSI-BLAST and DELTA-BLAST, are critical for many bioinformatics tasks.
  • The amount of unique protein data grows more slowly than the total database size, offering an opportunity for acceleration.

Purpose of the Study:

  • To develop significantly faster and comparably accurate homology search tools.
  • To accelerate existing analysis pipelines by enabling direct substitution of new tools.
  • To address the computational bottleneck in large-scale protein database searching.

Main Methods:

  • Introduction of a novel local similarity-based compression scheme.
  • Development of compressively accelerated protein BLAST (CaBLASTP) tools.
  • Implementation allowing direct integration into existing analysis pipelines.
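
The idea behind the methods above can be illustrated with a toy sketch. This is not the CaBLASTP implementation: it assumes equal-length sequences, uses position-wise identity as a stand-in for local alignment, and all names (`Cluster`, `compress`, `search`, the thresholds) are hypothetical. It shows only the general pattern of storing similar sequences as edits against a representative and searching the representatives first.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One representative sequence plus links to the originals it stands for."""
    representative: str
    # Each member is (sequence id, list of (position, residue) substitutions).
    members: list = field(default_factory=list)


def identity(a: str, b: str) -> float:
    """Fraction of matching positions; a toy stand-in for a local alignment score."""
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def compress(database: dict, threshold: float = 0.85) -> list:
    """Greedily store each sequence as substitutions against a similar representative."""
    clusters = []
    for seq_id, seq in database.items():
        for cluster in clusters:
            rep = cluster.representative
            if identity(seq, rep) >= threshold:
                diffs = [(i, r) for i, (r, c) in enumerate(zip(seq, rep)) if r != c]
                cluster.members.append((seq_id, diffs))
                break
        else:
            # No similar representative found: this sequence is new unique data.
            clusters.append(Cluster(seq, [(seq_id, [])]))
    return clusters


def decompress(cluster: Cluster, diffs: list) -> str:
    """Rebuild an original sequence from its representative and stored substitutions."""
    residues = list(cluster.representative)
    for pos, residue in diffs:
        residues[pos] = residue
    return "".join(residues)


def search(query: str, clusters: list, coarse: float = 0.5, fine: float = 0.85) -> list:
    """Coarse pass over representatives only, then a fine pass over linked members."""
    hits = []
    for cluster in clusters:
        if identity(query, cluster.representative) < coarse:
            continue  # every member of this cluster is skipped without decompression
        for seq_id, diffs in cluster.members:
            if identity(query, decompress(cluster, diffs)) >= fine:
                hits.append(seq_id)
    return hits
```

In this sketch the coarse threshold is deliberately looser than the fine one, so that a representative that only roughly matches the query can still lead to a closer match among its members.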

Main Results:

  • CaBLASTP tools demonstrate significantly higher speed compared to state-of-the-art tools like HHblits, DELTA-BLAST, and PSI-BLAST, with comparable accuracy.
  • CaBLASTP runtime scales almost linearly with the amount of unique data, whereas current BLASTP variants scale linearly with the size of the full database being searched.
  • The new algorithms accelerate critical tasks such as protein structure prediction and orthology mapping.
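
The scaling claim in the results above can be made concrete with a rough cost model. This is a simplification, not a measurement from the study: it counts one unit of work per sequence comparison, and the function names and the figures for unique sequences and candidates are hypothetical.

```python
def full_scan_cost(total_sequences: int) -> int:
    """Comparisons when every database sequence is aligned against the query."""
    return total_sequences


def compressed_cost(unique_sequences: int, candidates: int) -> int:
    """Coarse pass over unique data plus a fine pass over the flagged candidates."""
    return unique_sequences + candidates


# Hypothetical scenario: the database grows through redundancy while the
# amount of unique content (and the number of candidates per query) stays fixed.
UNIQUE = 500
CANDIDATES = 20
for total in (1_000, 10_000, 100_000):
    print(total, full_scan_cost(total), compressed_cost(UNIQUE, CANDIDATES))
```

Under these assumptions the full-scan cost tracks the total database size, while the compressed cost depends only on the unique data and the candidates examined in the fine pass.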

Conclusions:

  • CaBLASTP offers a substantial improvement in homology search efficiency.
  • The compression-based approach effectively addresses the scalability challenge posed by large protein databases.
  • These tools have broad implications for accelerating numerous downstream bioinformatics applications.