

Ten quick tips for sequence-based prediction of protein properties using machine learning.

Qingzhen Hou1,2, Katharina Waury3, Dea Gogishvili3

  • 1Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong, P. R. China.

PLOS Computational Biology | December 1, 2022
Summary
This summary is machine-generated.

This study addresses common issues in machine learning for protein property prediction. It provides guidance for computational biologists to improve clarity and rigor in their methods, ensuring reproducibility.

Area of Science:

  • Computational biology
  • Bioinformatics
  • Machine learning applications in life sciences

Background:

  • Genome sequencing data is widely available, driving the popularity of machine learning (ML) for protein property prediction.
  • Recurring issues in published research hinder understanding and replication of ML-based protein studies.
  • A gap exists between biologists' domain knowledge and ML experts' methodological expertise when ML is applied to proteins.

Purpose of the Study:

  • To bridge the knowledge gap between biologists and ML experts in protein property prediction.
  • To provide clear guidelines for developing and reporting ML methods in computational biology.
  • To improve the clarity, rigor, and reproducibility of ML-based biological research.

Main Methods:

  • Analysis of recurring issues in published manuscripts and pre-prints.
  • Identification of common pitfalls in applying ML to protein sequence data.
  • Synthesis of best practices for clarity, rigor, and benchmark comparisons.

Main Results:

  • Identified critical issues related to annotation clarity (source, metrics, definitions).
  • Highlighted problems concerning methodological rigor (e.g., use of structural information, benchmark selection, statistical significance).
  • Detailed specific recommendations for developers to avoid common errors.

Conclusions:

  • Clear reporting of data sources, metrics, and definitions is crucial for ML studies.
  • Maintaining methodological rigor, including avoiding hidden data sources and using appropriate comparisons, is essential.
  • Adherence to these guidelines will enhance the understanding, replicability, and impact of ML applications in computational biology.
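
The summary above lists statistical significance among the rigor issues that arise in benchmark comparisons. As a rough illustration only, and not the authors' procedure, the sketch below uses a paired bootstrap to ask how often one predictor fails to beat another when the benchmark proteins are resampled. The function names (`accuracy`, `paired_bootstrap`), the choice of accuracy as the metric, and the toy labels are all assumptions made for this example.

```python
import random


def accuracy(labels, preds):
    """Fraction of predictions that match the reference labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def paired_bootstrap(labels, preds_a, preds_b, n_resamples=2000, seed=0):
    """Paired bootstrap comparison of two predictors on one benchmark.

    Returns (observed_diff, frac_not_better), where observed_diff is
    accuracy(A) - accuracy(B) on the full benchmark and frac_not_better
    is the fraction of resampled benchmarks on which A does not beat B.
    All names here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    n = len(labels)
    observed = accuracy(labels, preds_a) - accuracy(labels, preds_b)
    not_better = 0
    for _ in range(n_resamples):
        # Resample benchmark items with replacement; the same indices are
        # used for both methods, which keeps the comparison paired.
        idx = [rng.randrange(n) for _ in range(n)]
        diff = sum(
            (preds_a[i] == labels[i]) - (preds_b[i] == labels[i]) for i in idx
        ) / n
        if diff <= 0:
            not_better += 1
    return observed, not_better / n_resamples


if __name__ == "__main__":
    # Toy per-residue labels (1 = positive class, 0 = negative), invented here.
    labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
    method_a = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
    method_b = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
    diff, frac = paired_bootstrap(labels, method_a, method_b)
    print(f"accuracy difference: {diff:.2f}; A not better in {frac:.1%} of resamples")
```

A small fraction suggests the observed difference is unlikely to be an artifact of which proteins happened to be in the benchmark. In a real study the metric, the resampling unit (for example whole proteins rather than individual residues), and the number of resamples would all need to be chosen and reported explicitly.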