



TALE: Transformer-based protein function Annotation with joint sequence-Label Embedding.

Yue Cao1, Yang Shen1

  • 1Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.

Bioinformatics (Oxford, England) | March 23, 2021
Summary
This summary is machine-generated.

A new deep learning model, Transformer-based protein function Annotation with joint sequence-Label Embedding (TALE), predicts protein functions from sequence data alone. TALE generalizes better than competing methods to novel sequences and functions.


Area of Science:

  • Computational biology
  • Bioinformatics
  • Machine learning in genomics

Background:

  • High-throughput sequencing generates vast amounts of sequence data, but functional annotation lags behind.
  • Existing computational protein function annotation methods face limitations in applicability and generalizability.
  • Reliance on non-sequence data or poor generalization to novel sequences/functions hinders progress.

Purpose of the Study:

  • To develop a novel deep learning model for protein function annotation using solely sequence information.
  • To enhance generalizability to novel sequences, species, and functions.
  • To provide a high-throughput, accurate, and broadly applicable alternative to experimental methods.

Main Methods:

  • Proposed the Transformer-based protein function Annotation with joint sequence-Label Embedding (TALE) model.
  • Utilized self-attention-based transformers to capture global sequence patterns for generalizability.
  • Employed joint embedding of protein function labels (hierarchical GO terms) and sequence features for unseen functions.
  • Combined TALE with a sequence-similarity method, yielding TALE+, for enhanced performance.
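
The methods listed above can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the summary gives no architectural details, so the single-head attention with identity projections, the mean pooling, the dot-product scoring, the weight `alpha`, and all function and label names here are assumptions chosen for illustration. The sketch shows how self-attention lets every residue contribute to a global sequence representation, how placing GO-term labels in the same vector space as that representation allows any label to be scored, and how a model score might be blended with a similarity-based score in the spirit of TALE+.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity projections (illustrative only).

    Each output vector is a weighted average of all input vectors, so every
    residue position can draw on the whole sequence.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / scale for k in tokens])
        out.append([sum(w * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


def mean_pool(tokens):
    n = len(tokens)
    return [sum(t[d] for t in tokens) / n for d in range(len(tokens[0]))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def score_labels(residue_embeddings, label_embeddings):
    """Score each label by its similarity to the pooled sequence vector.

    Because labels live in the same space as the sequence vector, a label
    with an embedding can be scored even if it was rare during training.
    """
    seq_vec = mean_pool(self_attention(residue_embeddings))
    return {label: sigmoid(dot(seq_vec, vec))
            for label, vec in label_embeddings.items()}


def combine(model_scores, similarity_scores, alpha=0.5):
    """Hypothetical convex combination of model and similarity-based scores."""
    return {label: alpha * s + (1.0 - alpha) * similarity_scores.get(label, 0.0)
            for label, s in model_scores.items()}


# Toy inputs: three residues and two placeholder labels, all 2-dimensional.
residues = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
labels = {"term_a": [2.0, -1.0], "term_b": [-1.0, 2.0]}

model_scores = score_labels(residues, labels)
blended = combine(model_scores, {"term_a": 0.9}, alpha=0.5)
```

In this sketch a label missing from the similarity-based scores simply contributes zero, so the blended score falls back to a scaled model score; a real system would likely treat that case more carefully.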

Main Results:

  • TALE+ outperformed competing methods using only sequence input.
  • TALE+ surpassed a state-of-the-art network-based method in two out of three Gene Ontology (GO) categories.
  • Demonstrated superior generalizability to proteins with low sequence similarity, new species, and rare functions.
  • Ablation studies confirmed the contributions of algorithmic components to accuracy and generalizability.

Conclusions:

  • TALE offers a powerful, sequence-only approach for accurate protein function prediction.
  • The model exhibits significant improvements in generalizability, addressing key limitations of current methods.
  • TALE provides insights into protein sequence-function relationships and advances computational annotation capabilities.