Benchmarking protein language models for protein crystallization.

Raghvendra Mall¹, Rahul Kaushik², Zachary A Martinez³

  • ¹Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates. Email: raghvendra.mall@tii.ae

Scientific Reports | January 18, 2025
Summary
This summary is machine-generated.

We benchmarked open protein language models (PLMs) for predicting protein crystallization. The ESM2 model embeddings significantly improved prediction accuracy, outperforming existing methods and enabling the design of novel crystallizable proteins.

Keywords: Benchmarking; Open protein language models (PLMs); Protein crystallization; Protein generation

Area of Science:

  • Structural Biology
  • Computational Biology
  • Machine Learning

Background:

  • Protein structure determination is crucial for understanding function, often relying on X-ray crystallography.
  • In silico methods, particularly deep learning, are being developed to predict protein crystallization propensity directly from sequence, reducing reliance on costly, failure-prone crystallization trials.
  • Protein Language Models (PLMs) offer a powerful approach for learning sequence-based representations.

Purpose of the Study:

  • To benchmark the performance of open protein language models (PLMs) for predicting protein crystallization propensity.
  • To identify the most effective PLMs and machine learning classifiers for this task.
  • To explore the utility of PLMs in generating novel, potentially crystallizable proteins.

Main Methods:

  • Benchmarking various open PLMs (ESM2, Ankh, ProtT5-XL, ProstT5, xTrimoPGLM, SaProt) using LightGBM/XGBoost classifiers on average protein embeddings.
  • Comparing PLM-based methods against state-of-the-art sequence-based predictors (DeepCrystal, ATTCrys, CLPred).
  • Fine-tuning the ProtGPT2 model for de novo protein generation, followed by multi-stage filtration.
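
The "average protein embeddings" step in the first bullet above can be illustrated with a minimal sketch. It assumes each protein arrives as a list of per-residue vectors from a PLM; the function name `mean_pool` and the toy 3-dimensional values are hypothetical and not taken from the study, whose embeddings would have hundreds to thousands of dimensions.

```python
from typing import List, Sequence


def mean_pool(residue_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue vectors into one fixed-length protein vector."""
    if not residue_embeddings:
        raise ValueError("protein has no residues")
    dim = len(residue_embeddings[0])
    totals = [0.0] * dim
    for vec in residue_embeddings:
        if len(vec) != dim:
            raise ValueError("inconsistent embedding dimension")
        for i, value in enumerate(vec):
            totals[i] += value
    n_residues = len(residue_embeddings)
    return [t / n_residues for t in totals]


# Toy per-residue embeddings (hypothetical values) for two proteins of
# different lengths: 2 residues and 3 residues, 3 dimensions each.
protein_a = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
protein_b = [[0.0, 1.0, 1.0], [0.0, 3.0, 1.0], [3.0, 2.0, 1.0]]

# Each protein becomes one row of equal length, whatever its sequence length.
features = [mean_pool(protein_a), mean_pool(protein_b)]
```

Because every protein maps to a vector of the same length, the rows can be stacked into a feature matrix and passed to a gradient-boosted classifier such as LightGBM or XGBoost. The classifier settings used in the study are not given here, so none are assumed.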

Main Results:

  • LightGBM classifiers using embeddings from ESM2 models with 30 and 36 transformer layers (150M and 3B parameters, respectively) showed performance gains of 3-5% across multiple metrics (AUPR, AUC, F1) on independent test sets compared with all other tested models.
  • Among the benchmarked PLMs, ESM2 was the most effective for predicting protein crystallization outcomes.
  • Five novel, potentially crystallizable proteins were identified among the sequences generated by the fine-tuned ProtGPT2 model after multi-stage filtration.
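
For readers unfamiliar with the three metrics named above, the following is a minimal pure-Python sketch of how each can be computed for a binary crystallizable / non-crystallizable classifier. The labels, scores, and the 0.5 decision threshold are toy assumptions, not data from the study, and AUPR is estimated here by average precision, which is one common estimator among several.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision values taken at the rank of each positive."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Toy labels (1 = crystallizable) and predicted scores, hypothetical values.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
aupr = average_precision(y_true, scores)
```

AUC and AUPR are computed from the ranking of scores and need no threshold, while F1 depends on the chosen cutoff. AUPR is usually the more informative of the two ranking metrics when crystallizable proteins are a minority of the test set.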

Conclusions:

  • Open protein language models, particularly ESM2, show superior performance in predicting protein crystallization propensity.
  • The TRILL platform effectively democratizes the use of PLMs for this critical task.
  • PLM-based generative approaches hold promise for accelerating the discovery of novel proteins with desired properties, such as crystallizability.