


Inferring Local Protein Structural Similarity from Sequence Alone.

Zinnia Ma¹, Javier Espinoza Herrera², Elsy Buitrago-Delgado³

  • ¹Department of Bioengineering, University of California, San Diego 92093, California, United States.

Journal of Chemical Information and Modeling | May 2, 2026
Summary
This summary is machine-generated.

Protein language models (pLMs) can detect local protein structural similarities using only sequence data. This method identifies conserved motifs without 3D structures, enabling scalable structural annotation and discovery.


Area of Science:

  • Computational Biology
  • Structural Bioinformatics
  • Machine Learning in Biology

Background:

  • Identifying local protein structural similarity is crucial for understanding protein function and evolution.
  • Current methods often rely on 3D structural models, limiting scalability and accessibility.
  • Protein language models (pLMs) have shown promise in capturing biological information from sequences.

Purpose of the Study:

  • To investigate whether protein language models (pLMs) implicitly capture fine-grained structural signals from sequence data alone.
  • To develop a novel framework for detecting locally aligned structural regions directly from protein sequences.
  • To provide a scalable, sequence-based alternative to traditional structure-based methods for protein analysis.

Main Methods:

  • Utilizing mean-pooling of residue embeddings from pLMs over sliding windows.
  • Employing cosine similarity to compare these pooled embeddings across different proteins.
  • Developing a framework to identify diagonal patterns indicative of local structural alignment.
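
The first two steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `window_mean_pool` and `cosine_similarity_matrix`, the stride of one residue, and the window size are assumptions, and the per-residue embeddings are assumed to have already been extracted from some pLM as an array of shape (sequence length, embedding dimension).

```python
import numpy as np


def window_mean_pool(residue_emb: np.ndarray, window: int) -> np.ndarray:
    """Mean-pool per-residue embeddings over sliding windows (stride 1).

    residue_emb: array of shape (L, D), one row per residue (assumed pLM output).
    Returns an array of shape (L - window + 1, D), one row per window.
    """
    if residue_emb.ndim != 2:
        raise ValueError("residue_emb must have shape (L, D)")
    length, dim = residue_emb.shape
    if not 1 <= window <= length:
        raise ValueError("window must be between 1 and the sequence length")
    # Prefix sums let each window sum be computed as a difference of two rows.
    csum = np.vstack([np.zeros((1, dim)), np.cumsum(residue_emb, axis=0)])
    return (csum[window:] - csum[:-window]) / window


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray,
                             eps: float = 1e-12) -> np.ndarray:
    """Cosine similarity between every row of a (n, D) and every row of b (m, D)."""
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return a_unit @ b_unit.T
```

Calling `cosine_similarity_matrix(window_mean_pool(emb_a, w), window_mean_pool(emb_b, w))` for two proteins gives a window-by-window similarity matrix in which a locally aligned region would be expected to show up as a streak of high values along a diagonal.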

Main Results:

  • Diagonal patterns in embedding comparisons reveal locally aligned protein regions, even without sequence identity.
  • The proposed framework successfully detects locally aligned structural regions directly from sequences.
  • A case study on SRC homology 3 (SH3) domains demonstrated the recovery of structurally conserved motifs across diverse sequence contexts.
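
One simple way to locate such a diagonal pattern in a window-by-window similarity matrix is to scan every diagonal for its longest run of cells above a cutoff. The summary does not describe how the authors score diagonals, so the sketch below is a stand-in technique under stated assumptions: the function name `longest_diagonal_run` and the fixed `threshold` are hypothetical choices made for illustration.

```python
import numpy as np


def longest_diagonal_run(sim: np.ndarray, threshold: float) -> tuple:
    """Find the longest run of consecutive cells >= threshold along any diagonal.

    sim: similarity matrix of shape (n_rows, n_cols), e.g. cosine similarities
         between pooled windows of two proteins.
    Returns (run_length, row_start, col_start), or (0, -1, -1) if no cell
    reaches the threshold. The threshold is an illustrative assumption.
    """
    n_rows, n_cols = sim.shape
    best = (0, -1, -1)
    # Offsets run from the bottom-left diagonal to the top-right diagonal.
    for offset in range(-(n_rows - 1), n_cols):
        diag = np.diagonal(sim, offset=offset)
        row0 = max(-offset, 0)  # row index of the first cell on this diagonal
        col0 = max(offset, 0)   # column index of the first cell on this diagonal
        run = 0
        for pos, value in enumerate(diag):
            if value >= threshold:
                run += 1
                if run > best[0]:
                    start = pos - run + 1
                    best = (run, row0 + start, col0 + start)
            else:
                run = 0
    return best
```

A run of length k starting at (i, j) means that windows i to i+k-1 of the first protein line up with windows j to j+k-1 of the second. A real pipeline would likely tolerate small gaps and use a data-driven cutoff rather than a fixed one.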

Conclusions:

  • Protein language models implicitly capture structural information, enabling sequence-based detection of local structural similarities.
  • The developed framework offers a lightweight and scalable alternative to structure-based methods.
  • This approach facilitates high-throughput structural discovery and annotation using only protein sequence data.