


Sequence representation approaches for sequence-based protein prediction tasks that use deep learning.

Feifei Cui¹, Zilong Zhang¹, Quan Zou²

  • ¹University of Electronic Science and Technology of China, Chengdu, Sichuan, China.

Briefings in Functional Genomics | February 2, 2021
Summary
This summary is machine-generated.

This review covers protein sequence embedding methods for deep learning in bioinformatics. It details various approaches to represent amino acid sequences, aiding researchers in selecting optimal models for protein prediction tasks.

Keywords:
deep learning; end-to-end learning; protein sequence embedding; sequence representation; transfer learning


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Deep learning is increasingly vital in bioinformatics for sequence-based protein prediction.
  • Availability of large biological datasets and rapid advancements in deep learning drive this trend.

Purpose of the Study:

  • To summarize the main approaches to protein sequence data representation (encoding/embedding).
  • To review the architectures and development of various sequence embedding models.
  • To aid researchers in selecting suitable models for their requirements.
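
As a concrete illustration of what "encoding" a protein sequence can mean, the sketch below shows one-hot encoding, one of the simplest fixed representations. It is a minimal example and is not taken from the review: the function name `one_hot_encode`, the alphabetical ordering of the 20 standard amino acids, and the choice to map unknown residues to all-zero rows are all assumptions made here for illustration.

```python
import numpy as np

# The 20 standard amino acids as one-letter codes, in a fixed order.
# The ordering is arbitrary (an assumption); it only has to stay consistent.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_TO_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return an (L, 20) matrix with one row per residue.

    Residues outside the 20 standard letters (for example 'X') become
    all-zero rows. That handling is an illustrative choice.
    """
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)), dtype=np.float32)
    for position, residue in enumerate(sequence.upper()):
        index = AA_TO_INDEX.get(residue)
        if index is not None:
            encoded[position, index] = 1.0
    return encoded


# Four residues, the last of which is a non-standard placeholder.
example = one_hot_encode("MKTX")
```

Each row carries no information about how similar two amino acids are, which is one reason learned embeddings are often preferred over this kind of fixed encoding.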

Main Methods:

  • Categorization of protein sequence embedding methods: end-to-end, non-contextual, and transfer learning-based.
  • Inclusion of task-specific embeddings (e.g., for structure prediction, drug discovery).
  • Theoretical review of embedding model architectures and their evolution.
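
The difference between these categories can be sketched with a simple embedding lookup. The example below is a hedged illustration, not the review's method: it assumes overlapping k-mer tokens as the "words" of a sequence, and the helper names `kmer_tokens`, `build_vocab`, and `embed_sequence` are hypothetical. The embedding table here is random placeholder data. In a non-contextual or transfer-learning setting such a table would typically be pretrained on a large sequence corpus and reused, whereas in an end-to-end setting it would be learned together with the prediction task.

```python
import numpy as np


def kmer_tokens(sequence: str, k: int = 3) -> list[str]:
    """Split a sequence into overlapping k-mers with stride 1."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


def embed_sequence(sequence: str, vocab: dict[str, int],
                   table: np.ndarray, k: int = 3) -> np.ndarray:
    """Average the embedding rows of all known k-mers in the sequence.

    Tokens missing from the vocabulary are skipped; if none are known,
    a zero vector is returned.
    """
    ids = [vocab[t] for t in kmer_tokens(sequence, k) if t in vocab]
    if not ids:
        return np.zeros(table.shape[1], dtype=table.dtype)
    return table[ids].mean(axis=0)


tokens = kmer_tokens("MKTAY", k=3)  # three overlapping 3-mers
vocab = build_vocab(tokens)

# Placeholder table: one random 8-dimensional vector per k-mer (assumption).
rng = np.random.default_rng(seed=0)
table = rng.normal(size=(len(vocab), 8))

vector = embed_sequence("MKTAY", vocab, table, k=3)
```

Averaging the token vectors gives a fixed-length representation regardless of sequence length, at the cost of discarding residue order beyond the k-mer window.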

Main Results:

  • Comprehensive overview of diverse protein sequence embedding strategies.
  • Detailed explanation of different embedding methodologies and their applications.
  • Identification of key factors influencing model performance in sequence-based prediction.

Conclusions:

  • Effective protein sequence representation is crucial for deep learning model performance.
  • A wide array of embedding techniques is available, catering to various bioinformatics tasks.
  • This review provides a valuable resource for selecting appropriate sequence embedding models.