PepBERT: Lightweight language models for bioactive peptide representation.

Zhenjiao Du1, Doina Caragea2, Xiaolong Guo3

  • 1Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA.

bioRxiv: the Preprint Server for Biology | July 9, 2025
Summary
This summary is machine-generated.

Short peptides are underrepresented in the data used to train protein language models (pLMs). PepBERT, a new lightweight peptide language model, matches or outperforms a larger benchmark model on 8 of 9 peptide prediction tasks, which could accelerate the discovery of bioactive peptides for functional foods.

Keywords:
Scientific large language model; bioactive peptide; drug discovery; machine learning; peptide representation

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Peptide Science

Background:

  • Protein language models (pLMs) are effective for protein and peptide tasks.
  • Short peptides (<50 residues) are underrepresented in standard pLM training data (e.g., only 2.8% of the UniProt Reference Cluster).
  • This underrepresentation limits the efficacy of pLMs for peptide-specific applications.
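
The underrepresentation described above can be checked on any sequence collection by counting how many entries fall below the length cutoff. The following is a minimal sketch, not the authors' procedure: the function name `short_peptide_fraction` and the toy sequences are hypothetical, and only the 50-residue threshold comes from the summary.

```python
def short_peptide_fraction(sequences, max_len=50):
    """Return the fraction of sequences strictly shorter than max_len residues."""
    sequences = list(sequences)
    if not sequences:
        return 0.0
    short = sum(1 for seq in sequences if len(seq) < max_len)
    return short / len(sequences)


# Toy data (hypothetical): two short peptides and two longer protein-like strings.
toy = ["GLFDIVKKVV", "A" * 120, "M" * 300, "YPFPGPI"]
print(short_peptide_fraction(toy))  # 0.5
```

Applied to a real dataset, a value near 0.028 would correspond to the 2.8% figure quoted above.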

Purpose of the Study:

  • To develop a specialized language model for encoding peptide sequences.
  • To address the limitations of existing pLMs in handling short peptides.
  • To enhance the discovery of bioactive peptides for functional foods.

Main Methods:

  • Development of PepBERT, a lightweight peptide language model, with two versions: PepBERT-large (4.9M parameters) and PepBERT-small (1.86M parameters).
  • Pretraining PepBERT models from scratch using four custom peptide datasets.
  • Evaluation of PepBERT models on nine peptide-related downstream prediction tasks, comparing performance against the benchmark ESM-2 (7.5M parameters).
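
Before a language model can be pretrained on peptides, each sequence has to be turned into token IDs. The sketch below shows one common way to do this, assuming per-residue tokenization with BERT-style special tokens. The vocabulary, the token names, and the `encode` helper are illustrative assumptions; the summary does not describe PepBERT's actual tokenizer.

```python
# Hypothetical vocabulary: five special tokens followed by the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SPECIAL = ["[PAD]", "[CLS]", "[SEP]", "[MASK]", "[UNK]"]
VOCAB = {tok: i for i, tok in enumerate(SPECIAL + list(AMINO_ACIDS))}


def encode(peptide, max_len=52):
    """Map a peptide string to padded token IDs and an attention mask."""
    ids = [VOCAB["[CLS]"]]
    # Residues outside the 20 standard amino acids map to [UNK].
    ids += [VOCAB.get(res, VOCAB["[UNK]"]) for res in peptide.upper()]
    ids.append(VOCAB["[SEP]"])
    if len(ids) > max_len:
        raise ValueError("peptide is too long for max_len")
    pad = max_len - len(ids)
    mask = [1] * len(ids) + [0] * pad  # 1 = real token, 0 = padding
    ids = ids + [VOCAB["[PAD]"]] * pad
    return ids, mask


ids, mask = encode("YPFPGPI", max_len=12)
print(ids)
print(mask)
```

The default `max_len=52` is chosen here only so that a peptide of up to 50 residues fits alongside the two boundary tokens.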

Main Results:

  • PepBERT models performed better than or comparably to ESM-2 on 8 of the 9 peptide prediction tasks evaluated.
  • PepBERT provides a compact and efficient solution for generating high-quality peptide representations.
  • The models encode peptide sequences effectively, matching or outperforming the larger ESM-2 benchmark on most tasks.
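
To put the size difference in concrete terms, the parameter counts quoted in the methods can be compared directly. This is simple arithmetic on the figures given in this summary; the helper `relative_size` is a hypothetical name introduced for illustration.

```python
# Parameter counts as reported in this summary.
PARAMS = {
    "PepBERT-large": 4.9e6,
    "PepBERT-small": 1.86e6,
    "ESM-2": 7.5e6,
}


def relative_size(model_params, baseline_params):
    """Return model size as a fraction of the baseline size."""
    return model_params / baseline_params


for name in ("PepBERT-large", "PepBERT-small"):
    ratio = relative_size(PARAMS[name], PARAMS["ESM-2"])
    print(f"{name}: {ratio:.1%} of ESM-2's parameters")
```

On these figures, PepBERT-large is roughly two-thirds the size of the ESM-2 benchmark and PepBERT-small roughly one-quarter.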

Conclusions:

  • PepBERT offers a specialized and effective solution for peptide sequence encoding and downstream applications.
  • The model can accelerate the discovery of food-derived bioactive peptides with health benefits.
  • PepBERT supports the development of sustainable functional foods and the utilization of food processing by-products.