EC-Design: A Robust Framework for Enzyme Function Prediction Using Dimensionality-Reduced Sequence Features.

Huanghui Xia (1,2), Huangzhi Xia (3,4), Feng Qi (1,2)

  • (1) College of Life Sciences, Fujian Normal University, Fuzhou, Fujian 350117, China.

Journal of Chemical Information and Modeling | April 7, 2026
Summary
This summary is machine-generated.

EC-Design is a machine learning framework that classifies enzymes from sequence features reduced with principal component analysis (PCA) and Fisher Score feature selection. In the study, instance-based learning with k-nearest neighbors (k-NN) outperformed more complex models for enzyme annotation.

Area of Science:

  • Bioinformatics
  • Machine Learning
  • Enzyme Catalysis

Background:

  • Automated enzyme classification faces challenges due to high-dimensional data and imbalanced datasets.
  • Existing methods often struggle with accuracy and interpretability in large-scale enzyme annotation.

Purpose of the Study:

  • To develop an efficient and accurate machine learning framework, EC-Design, for automated enzyme classification.
  • To evaluate the performance of instance-based learning against ensemble methods for enzyme annotation.

Main Methods:

  • Applied Principal Component Analysis (PCA) for dimensionality reduction and Fisher Score for feature selection to features from 134,153 enzyme sequences.
  • Implemented and compared k-nearest neighbors (k-NN) against six other machine learning algorithms.
  • Analyzed model generalization, discriminative power (AUC), and class-specific performance metrics.
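
The feature-selection and k-NN steps listed above can be illustrated with a minimal, dependency-free sketch. It is not the EC-Design implementation. It assumes dipeptide composition as the sequence encoding (suggested by the dipeptide features reported in the results, but not confirmed as the exact encoding), uses one common definition of the Fisher Score (between-class over within-class variance), and leaves out the PCA step to stay within the standard library. All function names, sequences, and labels below are hypothetical.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 features


def dipeptide_composition(seq):
    """Return the 400-dimensional dipeptide frequency vector of a sequence."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DIPEPTIDES) or 1  # skip non-standard residues
    return [counts[d] / total for d in DIPEPTIDES]


def fisher_scores(X, y, eps=1e-12):
    """Fisher Score per feature: between-class over within-class variance."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        scores.append(between / (within + eps))  # eps avoids division by zero
    return scores


def select_top_features(X, scores, k):
    """Keep the k highest-scoring feature columns; also return their indices."""
    top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    return [[row[j] for j in top] for row in X], top


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up sequences and labels (not data from the study).
seqs = ["NGNGNGAA", "NGNGKNGA", "LLKKLLKK", "KKLLKLLK"]
labels = ["class_1", "class_1", "class_2", "class_2"]
X = [dipeptide_composition(s) for s in seqs]
X_sel, top = select_top_features(X, fisher_scores(X, labels), k=4)
query = [dipeptide_composition("NGNGNGNA")[j] for j in top]
print(sorted(DIPEPTIDES[j] for j in top))       # ['GA', 'LK', 'LL', 'NG']
print(knn_predict(X_sel, labels, query, k=3))   # class_1
```

Because k-NN stores the training examples and classifies by proximity, each prediction can be traced back to specific neighbouring sequences and to the selected features that made them close.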

Main Results:

  • k-nearest neighbors (k-NN) achieved a top accuracy of 74.59% and a macro-F1 score of 0.6859, outperforming ensemble methods.
  • The model demonstrated robust generalization (74.37 ± 0.49%) and excellent discriminative power (mean AUC = 0.937).
  • Identified dipeptide patterns containing asparagine and glycine as key discriminative features, offering biological interpretability.
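
For readers comparing the reported accuracy, macro-F1, and AUC figures, the sketch below shows how these metrics are commonly computed. It is an illustration on made-up labels, not the study's evaluation code. The function names are hypothetical, and how the study averaged AUC across enzyme classes is not stated here, so only the binary case is shown.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: rare classes count as much as common ones."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_values = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)


def binary_auc(scores, is_positive):
    """Probability that a random positive outscores a random negative (ties 0.5)."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels with an 8:2 class imbalance (not data from the study).
y_true = ["common"] * 8 + ["rare"] * 2
y_pred = ["common"] * 8 + ["common", "rare"]
print(accuracy(y_true, y_pred))             # 0.9
print(round(macro_f1(y_true, y_pred), 4))   # 0.8039
print(binary_auc([0.9, 0.8, 0.4, 0.3], [True, False, True, False]))  # 0.75
```

On imbalanced data, macro-F1 tends to sit below accuracy because a single missed example in a rare class lowers that class's F1 sharply, which is one reason both numbers are usually reported together.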

Conclusions:

  • The EC-Design framework establishes instance-based learning as an effective approach for large-scale enzyme annotation.
  • The study provides a transparent and accurate alternative to complex ensemble models for enzyme classification.
  • Identified key sequence features that provide biologically interpretable insights into enzyme function.