Deep protein representations enable recombinant protein expression prediction.

Hannah-Marie Martiny¹, Jose Juan Almagro Armenteros², Alexander Rosenberg Johansen³

  • ¹Research Group for Genomic Epidemiology, National Food Institute, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.

Computational Biology and Chemistry | November 14, 2021
Summary
This summary is machine-generated.

A machine learning model built specifically for Bacillus subtilis improves the prediction of recombinant protein expression. This helps in selecting compatible genes for industrial enzyme production, saving time and resources.

Area of Science:

  • Biotechnology
  • Molecular Biology
  • Bioinformatics

Background:

  • Recombinant gene expression is vital for industrial enzyme production, where the goal is a high protein yield from the host microbe.
  • Current overexpression strategies involve vector modification, cultivation adjustments, and codon optimization, which are time-consuming.
  • Existing soluble expression prediction tools are optimized for Escherichia coli and do not suit Bacillus subtilis, a key industrial host.

Purpose of the Study:

  • To develop a Bacillus subtilis-specific machine learning model for predicting protein expressibility.
  • To address the limitations of existing prediction tools, which do not account for both protein solubility and host specificity.
  • To enable efficient selection of genes with high expression potential in Bacillus subtilis.

Main Methods:

  • A machine learning model was trained using a combination of a small labeled dataset and millions of unlabeled proteins.
  • The model was trained on Bacillus subtilis-specific data, incorporating features beyond simple amino acid frequencies.
  • Performance was evaluated using area under the curve (AUC) and Matthews correlation coefficient (MCC).
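
The two evaluation metrics named above can be sketched in plain Python. This is a minimal illustration, not the study's evaluation code: the function names `roc_auc` and `mcc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example.

```python
from math import sqrt


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcc(labels, scores, threshold=0.5):
    """Matthews correlation coefficient after thresholding the scores.
    The 0.5 threshold is an assumption for this sketch."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy data (hypothetical): 1 = expressed, 0 = not expressed.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"MCC = {mcc(labels, scores):.3f}")
```

AUC is threshold-free and measures ranking quality, while MCC depends on the chosen threshold and accounts for all four confusion-matrix cells, which is why the two are often reported together for imbalanced classification tasks.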

Main Results:

  • The developed model demonstrates modest predictive performance (AUC 0.64, MCC 0.2) for Bacillus subtilis expressibility.
  • The inclusion of unlabeled proteins significantly improved model performance compared to using only labeled data.
  • Predicted class probabilities correlated with actual protein expression levels, indicating the model captures relevant biological features.
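
One way to check whether predicted class probabilities track measured expression levels is a rank correlation. The sketch below uses Spearman's coefficient as an assumed choice, since the summary does not say which correlation measure was used; the helper names and the toy values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): model probabilities and measured expression levels.
probabilities = [0.15, 0.40, 0.35, 0.80, 0.65]
expression = [2.0, 10.0, 12.0, 55.0, 30.0]

print(f"Spearman rho = {spearman(probabilities, expression):.2f}")
```

A rank-based measure is used here because class probabilities and expression levels live on different scales, so only their ordering is compared.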

Conclusions:

  • A Bacillus subtilis-specific machine learning model can predict protein expressibility, though with modest accuracy, and outperforms general tools.
  • The model is sufficient for prioritizing expression candidates in high-throughput screening for industrial enzyme production.
  • The model captures key features influencing protein expression, including base frequencies and solubility, within the Bacillus subtilis host.
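
Prioritizing expression candidates, as described in the conclusions, amounts to ranking genes by predicted probability and testing the top of the list first. The sketch below assumes the model's output is available as a mapping from gene identifier to probability; the function name `prioritize`, the gene identifiers, and the probabilities are hypothetical.

```python
def prioritize(candidates, top_k):
    """Return the top_k gene identifiers, ordered by descending
    predicted probability of expression."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in ranked[:top_k]]


# Hypothetical model output: gene identifier -> predicted probability.
predicted = {"gene_A": 0.31, "gene_B": 0.78, "gene_C": 0.55, "gene_D": 0.12}

shortlist = prioritize(predicted, top_k=2)
print(shortlist)
```

With a model of modest accuracy, a ranked shortlist of this kind is a way to enrich the screening pool, not a guarantee that any individual candidate will express.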