



OpenProteinSet: Training data for structural biology at scale.

Gustaf Ahdritz¹, Nazim Bouatta², Sachin Kadyan³

  • ¹Harvard University.

arXiv | August 23, 2023
PubMed
Summary
This summary is machine-generated.

Researchers created OpenProteinSet, a large, open-source dataset of protein multiple sequence alignments (MSAs) and structures. It supplies training data for machine learning tasks in protein science, such as protein design and structure prediction.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning for Proteins

Background:

  • Multiple sequence alignments (MSAs) are crucial for protein design and structure prediction.
  • Recent advances like AlphaFold2 highlight the importance of large-scale MSAs.
  • Generating MSAs is computationally expensive, limiting data availability for research.
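
To make the notion of an MSA concrete, here is a minimal sketch of reading one and measuring its depth (number of aligned sequences) and gap fraction. It assumes the alignment is stored in A3M, a FASTA-like text format in which lowercase letters mark insertions relative to the query and "-" marks gaps. The function names and the tiny example alignment are hypothetical and are not taken from OpenProteinSet or its tooling.

```python
import string

# Deleting lowercase letters removes A3M insertion states, so every
# remaining row has the same length as the query sequence.
_DROP_LOWERCASE = str.maketrans("", "", string.ascii_lowercase)


def parse_a3m(text):
    """Return (headers, aligned_rows) from A3M-formatted text."""
    headers, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            headers.append(line[1:])
            rows.append("")
        elif rows:
            # Sequences may wrap over several lines; append to the current row.
            rows[-1] += line.translate(_DROP_LOWERCASE)
    return headers, rows


def msa_depth_and_gap_fraction(rows):
    """Return the number of rows and the fraction of cells that are gaps."""
    depth = len(rows)
    total_cells = sum(len(row) for row in rows)
    gap_cells = sum(row.count("-") for row in rows)
    return depth, (gap_cells / total_cells if total_cells else 0.0)


# Hypothetical toy alignment, for illustration only.
EXAMPLE_A3M = """>query
MKTAY
>hit_1
MK-AY
>hit_2
MKtTAaY
"""

if __name__ == "__main__":
    headers, rows = parse_a3m(EXAMPLE_A3M)
    depth, gap_fraction = msa_depth_and_gap_fraction(rows)
    print(headers, rows, depth, round(gap_fraction, 3))
```

Real alignments can contain many thousands of rows, which is part of why generating them is expensive. A production reader would also need to handle format variants that this sketch ignores.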

Conclusions:

  • OpenProteinSet provides essential training and validation data for protein science.
  • The dataset supports diverse research tasks, including protein structure, function, and design.
  • It is expected to advance large-scale multimodal machine learning research in proteomics.