From sequence to function through structure: Deep learning for protein design.

Noelia Ferruz1,2, Michael Heinzinger3, Mehmet Akdel4

  • 1Institute of Informatics and Applications, University of Girona, Girona, Spain.

Computational and Structural Biotechnology Journal | December 22, 2022
Summary
This summary is machine-generated.

This article reviews how artificial intelligence is transforming the creation of new proteins. It provides a guide to modern computational tools that generate protein sequences and predict their functions. The authors also demonstrate a workflow for designing a protein that could help create specific chemical compounds.

Keywords:
ADMM, Alternating Direction Method of Multipliers; CNN, Convolutional Neural Network; DL, Deep learning; Deep learning; Drug discovery; FNN, fully-connected neural network; GAN, Generative Adversarial Network; GCN, Graph Convolutional Network; GNN, Graph Neural Network; GO, Gene Ontology; GVP, Geometric Vector Perceptron; LSTM, Long-Short Term Memory; MLP, Multilayer Perceptron; MSA, Multiple Sequence Alignment; NLP, Natural Language Processing; NSR, Natural Sequence Recovery; Protein design; Protein language models; Protein prediction; VAE, Variational Autoencoder; pLM, protein Language Model; generative models; protein engineering; computational biology; synthetic biology

Area of Science:

  • Deep learning for protein design within computational biology
  • Bioinformatics and structural biology integration

Background:

Traditional protein design relied heavily on physicochemical force fields to predict stable structures. The field has since shifted toward end-to-end differentiable statistical models, which generate sequences faster. Prior research has shown that artificial intelligence can learn intricate biological patterns, and this has driven the adoption of natural language processing and computer vision methods in structural biology. Researchers now train models on massive databases to predict functional protein properties. However, the rapid proliferation of diverse software packages creates significant barriers for practitioners, and navigating these tools remains difficult. This review addresses the need to synthesize recent progress in the field.

Purpose Of The Study:

The aim of this article is to document recent progress in computational tools for designing novel proteins. The authors seek to address the complexity practitioners face when navigating the rapidly evolving landscape of software. They intend to provide a practical guide that simplifies the transition from theoretical sequences to functional predictions. This work motivates researchers to adopt modern statistical models over traditional physics-based approaches. The team aims to demonstrate the efficacy of their proposed pipeline through a concrete application example. They want to clarify how artificial intelligence can be effectively integrated into structural biology workflows. By synthesizing recent literature, they hope to highlight both the potential and the limitations of current technologies. This study serves as a resource for those looking to implement advanced design strategies in their own research.

Main Methods:

The authors perform a systematic review of computational advancements published over the past three years. They document various software frameworks to categorize the current landscape of generative modeling. Their review evaluates the efficacy of end-to-end differentiable statistical architectures. They establish a standardized pipeline that connects sequence generation to structural prediction modules. This workflow incorporates web-based visualization tools to facilitate immediate analysis of predicted molecular properties. The team tests this integrated system by generating a novel sequence for a specific biosynthetic application. They compare this automated strategy against legacy methods that used traditional physicochemical force fields. This methodology provides a clear roadmap for practitioners to implement state-of-the-art design techniques.
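
The generate-then-predict shape of such a pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `generate_sequences` is a hypothetical stand-in for a generative model (it samples residues uniformly at random), and `predict_confidence` is a hypothetical stand-in for a structure predictor's confidence output (it uses a toy sequence-complexity heuristic, not a real structural score). All names and the threshold value are invented for the example.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


@dataclass
class Candidate:
    sequence: str
    confidence: float  # placeholder score in [0, 1], NOT a real structural metric


def generate_sequences(n, length, seed=0):
    """Hypothetical stand-in for a generative model: uniform random residues."""
    rng = random.Random(seed)
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def predict_confidence(sequence):
    """Hypothetical stand-in for a structure predictor's confidence.

    Toy heuristic: fraction of the 20 residue types present in the sequence,
    so low-complexity sequences receive lower scores.
    """
    return len(set(sequence)) / len(AMINO_ACIDS)


def design_pipeline(n, length, min_confidence, seed=0):
    """Generate candidates, score each one, then filter and rank by score."""
    candidates = [
        Candidate(seq, predict_confidence(seq))
        for seq in generate_sequences(n, length, seed)
    ]
    kept = [c for c in candidates if c.confidence >= min_confidence]
    return sorted(kept, key=lambda c: c.confidence, reverse=True)


if __name__ == "__main__":
    for candidate in design_pipeline(n=10, length=50, min_confidence=0.5):
        print(f"{candidate.confidence:.2f}  {candidate.sequence}")
```

In a real workflow each stand-in would be replaced by an actual model, and the ranked candidates would be passed on to visualization and experimental testing.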

Main Results:

The authors find that generative models now allow for the rapid production of complex protein sequences. Their key findings from the literature indicate that integrating natural language processing significantly improves control over design parameters. The proposed pipeline successfully generates a candidate sequence for engineering a biosynthetic gene cluster. This specific protein is predicted to assist in the synthesis of a molecular glue-like compound. The review shows that these tools can provide functional insights within minutes of initial input. The authors report that the transition from force-field-based approaches to statistical models has increased design speed. The literature confirms that curated biological annotations are vital for training high-performing models. Their analysis highlights that current software suites offer unprecedented capabilities for structural prediction and functional design.

Conclusions:

The authors synthesize recent progress to clarify how machine learning models improve protein engineering workflows. They suggest that integrating sequence generation with structural prediction provides a robust framework for discovery. Their review highlights that current pipelines allow for rapid iteration from initial design to functional validation. The researchers propose that these computational advancements enable the creation of complex molecular architectures previously considered unattainable. They argue that the field must now focus on addressing existing limitations in model interpretability and data quality. The authors emphasize that future success depends on bridging the gap between theoretical predictions and experimental verification. Their synthesis indicates that deep learning will continue to redefine the boundaries of synthetic biology. The team concludes that standardized workflows are necessary to maximize the utility of these emerging technologies.

The authors propose a pipeline that integrates de novo sequence generation with structural prediction and web-based visualization. This workflow allows researchers to transition from raw sequence data to functional property insights within minutes, significantly accelerating the traditional design cycle compared to force-field-based methods.

The researchers utilize natural language processing and computer vision architectures. These computational frameworks are adapted from artificial intelligence to identify complex patterns within biological databases, which differs from traditional physics-based simulations that rely on explicit energy calculations.
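
To make the language-modeling analogy concrete, the sketch below treats each amino acid as a token and fits a bigram model (the probability of each residue given the previous one) to a handful of sequences. This is a deliberately simple stand-in, far smaller than the protein language models listed in the keywords, and is not a method taken from the article. The function names, the pseudocount, and the training sequences are assumptions made for the illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Map each residue letter to an integer token id."""
    return [TOKEN_ID[aa] for aa in sequence.upper()]


def train_bigram(sequences, pseudocount=1.0):
    """Estimate P(next residue | previous residue) with additive smoothing."""
    counts = Counter()
    for seq in sequences:
        tokens = tokenize(seq)
        counts.update(zip(tokens, tokens[1:]))
    n = len(AMINO_ACIDS)
    probs = []
    for prev in range(n):
        row_total = sum(counts[(prev, nxt)] for nxt in range(n)) + pseudocount * n
        probs.append([(counts[(prev, nxt)] + pseudocount) / row_total for nxt in range(n)])
    return probs


def mean_log_likelihood(sequence, probs):
    """Average log-probability per residue transition under the bigram model."""
    tokens = tokenize(sequence)
    pairs = list(zip(tokens, tokens[1:]))
    return sum(math.log(probs[a][b]) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    model = train_bigram(["ACACAC", "ACAC"])  # toy training data, not real proteins
    print(mean_log_likelihood("ACAC", model))
    print(mean_log_likelihood("WWWW", model))
```

Sequences that resemble the training data receive a higher mean log-likelihood than unrelated ones. Statistical models apply the same principle at scale, which is how they differ from physics-based simulations that compute explicit energies.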

The authors suggest that high-performance computing hardware is necessary to process the vast biological databases and curated annotations. This infrastructure allows for the training of deep learning models, which is a requirement for generating plausible sequences that exceed the capabilities of manual design.

The researchers use deep learning models to process sequence data and predict structural properties. Sequence data serves as the foundation for their proposed pipeline, enabling the generation of novel sequences that can be tested for specific biosynthetic applications.

The authors demonstrate the utility of their approach by suggesting a protein sequence that could be used to engineer a biosynthetic gene cluster. Success is assessed by predicting the protein's ability to facilitate the production of a molecular glue-like compound.

The researchers propose that the field must focus on overcoming current challenges in model interpretability. They suggest that addressing these hurdles will unlock new opportunities for engineering complex molecular systems, contrasting this with the current reliance on black-box predictive models.