



Generative AI Models for the Protein Scaffold Filling Problem.

Letu Qingge¹, Kushal Badal¹, Richard Annan¹

  • ¹Department of Computer Science, North Carolina A&T State University, Greensboro, North Carolina, USA.

Journal of Computational Biology: a Journal of Computational Molecular Cell Biology | October 23, 2024
Summary
This summary is machine-generated.

Generative AI models, including GPT-2, effectively solve the protein scaffold filling problem by accurately completing incomplete protein sequences. GPT-2 achieved 100% accuracy in filling gaps and determining full sequences for the MabCampth protein scaffold.

Keywords:
convolutional denoising autoencoder; de novo protein sequencing; transformer and generative pretrained transformer (GPT)


Area of Science:

  • Proteomics and Bioinformatics
  • Artificial Intelligence in Life Sciences

Background:

  • De novo protein sequencing is vital for understanding protein functions, drug discovery, and evolutionary studies.
  • Mass spectrometry techniques such as top-down and bottom-up tandem MS are common, but they often yield incomplete protein sequences with gaps, known as scaffolds.
  • The protein scaffold filling problem aims to infer complete protein sequences by filling these gaps.
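
To make the problem statement concrete, the sketch below represents a scaffold as a string in which unknown residues are marked with a placeholder, and fills each gap with a candidate segment. The `-` gap symbol, the function names, the toy sequence, and the assumption that each gap's length is known in advance are all illustrative choices, not details taken from the paper.

```python
GAP = "-"  # assumed placeholder for an unknown residue (not from the paper)


def gap_spans(scaffold: str) -> list[tuple[int, int]]:
    """Return half-open (start, end) index ranges for each run of gaps."""
    spans = []
    start = None
    for i, ch in enumerate(scaffold):
        if ch == GAP and start is None:
            start = i                      # a new gap run begins here
        elif ch != GAP and start is not None:
            spans.append((start, i))       # the gap run ended just before i
            start = None
    if start is not None:                  # scaffold ends inside a gap
        spans.append((start, len(scaffold)))
    return spans


def fill_scaffold(scaffold: str, fillers: list[str]) -> str:
    """Replace each gap run, left to right, with the matching filler string."""
    spans = gap_spans(scaffold)
    if len(spans) != len(fillers):
        raise ValueError("need exactly one filler per gap")
    out = list(scaffold)
    for (start, end), filler in zip(spans, fillers):
        if len(filler) != end - start:
            raise ValueError("filler length must match gap length")
        out[start:end] = filler
    return "".join(out)


# Toy example: two gaps in a short made-up scaffold.
toy_scaffold = "EVQL--SGG---QPGG"
filled = fill_scaffold(toy_scaffold, ["VE", "GLV"])
```

In this framing, a generative model's job is to propose the filler strings; the helper functions only handle the bookkeeping of where the gaps are.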

Purpose of the Study:

  • To address the protein scaffold filling problem using advanced generative AI techniques.
  • To evaluate and compare the performance of various AI models, including convolutional denoising autoencoders, transformers, and GPT models.
  • To assess model efficacy on both real and generated datasets.

Main Methods:

  • Application of generative AI models: convolutional denoising autoencoder, transformer, and generative pretrained transformer (GPT).
  • Comparison with a convolutional long short-term memory (CLSTM)-based sequence model.
  • Performance evaluation using real and generated protein scaffold datasets.
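
The summary does not say how the generated scaffold datasets were built. One plausible approach, shown here purely as an assumption, is to mask random contiguous stretches of a known full sequence, which yields (scaffold, full sequence) pairs of the kind a denoising model can be trained on. The function name, the `-` gap symbol, and all parameters are hypothetical.

```python
import random

GAP = "-"  # assumed placeholder for a masked residue (not from the paper)


def make_scaffold(sequence: str, n_gaps: int, max_gap_len: int,
                  rng: random.Random) -> str:
    """Mask up to n_gaps contiguous stretches of a full sequence.

    Gaps are placed independently, so they may overlap or touch; the
    result can therefore contain fewer than n_gaps separate gap runs.
    """
    if not 1 <= max_gap_len <= len(sequence):
        raise ValueError("max_gap_len must be between 1 and len(sequence)")
    out = list(sequence)
    for _ in range(n_gaps):
        gap_len = rng.randint(1, max_gap_len)               # inclusive bounds
        start = rng.randrange(len(sequence) - gap_len + 1)  # gap fits inside
        out[start:start + gap_len] = GAP * gap_len
    return "".join(out)


# Build one (scaffold, target) training pair from a toy sequence.
rng = random.Random(0)  # fixed seed so the example is reproducible
target = "EVQLVESGGGLVQPGG"
scaffold = make_scaffold(target, n_gaps=2, max_gap_len=3, rng=rng)
```

Every unmasked position in `scaffold` still agrees with `target`, so the pair can serve as a corrupted input and its clean reconstruction target.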

Main Results:

  • All proposed generative AI models demonstrated outstanding prediction accuracy in protein scaffold filling.
  • The GPT-2 model achieved 100% accuracy in both gap-filling and full sequence determination for the MabCampth protein scaffold.
  • GPT-2 significantly outperformed other evaluated models on the MabCampth dataset.
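
The results distinguish gap-filling accuracy from full-sequence accuracy, but the summary does not define either metric. A minimal sketch, assuming both are per-residue match rates (over gap positions only, and over all positions, respectively), could look like this; the function names and the toy sequences are hypothetical.

```python
GAP = "-"  # assumed placeholder for an unknown residue (not from the paper)


def gap_filling_accuracy(scaffold: str, predicted: str, reference: str) -> float:
    """Fraction of gap positions where the prediction matches the reference."""
    if not len(scaffold) == len(predicted) == len(reference):
        raise ValueError("all three sequences must have the same length")
    positions = [i for i, ch in enumerate(scaffold) if ch == GAP]
    if not positions:
        raise ValueError("scaffold contains no gaps")
    correct = sum(predicted[i] == reference[i] for i in positions)
    return correct / len(positions)


def full_sequence_accuracy(predicted: str, reference: str) -> float:
    """Fraction of all positions where the prediction matches the reference."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy example: one wrong residue inside the second gap.
toy_scaffold = "EVQL--SGG---QPGG"
toy_reference = "EVQLVESGGGLVQPGG"
toy_predicted = "EVQLVESGGGLAQPGG"
gap_acc = gap_filling_accuracy(toy_scaffold, toy_predicted, toy_reference)  # 4 of 5 gap residues correct
full_acc = full_sequence_accuracy(toy_predicted, toy_reference)             # 15 of 16 residues correct
```

Under these definitions, a reported 100% on both metrics would mean every gap residue and every residue of the completed sequence matches the reference.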

Conclusions:

  • Generative AI, particularly GPT-2, presents a highly effective solution for the protein scaffold filling problem.
  • These AI-driven methods can accurately reconstruct complete protein sequences from incomplete mass spectrometry data.
  • The findings highlight the potential of AI in advancing proteomics research and applications.