


Machine learning on normalized protein sequences.

Dominik Heider (1), Jens Verheyen, Daniel Hoffmann

  • (1) Department of Bioinformatics, Center of Medical Biotechnology, University of Duisburg-Essen, Universitaetsstr. 2, 45117 Essen, Germany. dominik.heider@uni-due.de

BMC Research Notes | April 2, 2011
Summary
This summary is machine-generated.

Normalizing biological sequences to uniform length using linear interpolation improves machine learning predictions for tasks like HIV-1 drug resistance and protein function classification.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Machine learning is applied to biological sequences for tasks like predicting HIV-1 drug resistance and protein function.
  • Many machine learning methods expect inputs of a fixed length, so they struggle with sequences whose lengths vary because of insertions and deletions.

Purpose of the Study:

  • To address the limitation of varying sequence lengths in biological sequence analysis.
  • To evaluate sequence normalization techniques for machine learning applications.

Main Methods:

  • Tested linear and non-linear interpolation methods for sequence length normalization.
  • Applied random forests for classification on 19 datasets.
  • Included tasks like HIV-1 drug resistance prediction and protein function prediction.
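
The linear normalization step listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `encode` and `normalize_length` are hypothetical, and the use of the Kyte-Doolittle hydropathy scale as the numeric descriptor is an assumption made for the example. Each residue is mapped to a number, and the resulting profile is resampled to a fixed number of points by linear interpolation, so that sequences of different lengths yield feature vectors of the same length.

```python
# Kyte-Doolittle hydropathy values. Choosing this scale as the numeric
# descriptor is an assumption made for illustration only.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map each amino acid letter to its hydropathy value (hypothetical helper)."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def normalize_length(values, target_length):
    """Resample a numeric profile to target_length points by linear interpolation."""
    if not values:
        raise ValueError("cannot normalize an empty sequence")
    if target_length < 1:
        raise ValueError("target_length must be at least 1")
    # Degenerate cases: nothing to interpolate between, so repeat the first value.
    if len(values) == 1 or target_length == 1:
        return [values[0]] * target_length

    # Spacing of the new sample points, measured in original positions.
    step = (len(values) - 1) / (target_length - 1)
    result = []
    for i in range(target_length):
        position = i * step
        # Index of the original point to the left, clamped so left + 1 is valid.
        left = min(int(position), len(values) - 2)
        fraction = position - left
        result.append(values[left] + fraction * (values[left + 1] - values[left]))
    return result


if __name__ == "__main__":
    # Two sequences of different length end up as vectors of equal length.
    short_profile = normalize_length(encode("MKVLA"), 8)
    long_profile = normalize_length(encode("MKVLAAGIVGLW"), 8)
    print(len(short_profile), len(long_profile))
```

The fixed-length vectors produced this way could then be passed to any standard classifier, such as the random forests mentioned above. The target length is a free parameter in this sketch.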

Main Results:

  • Linear interpolation outperformed non-linear methods on most datasets.
  • Non-linear methods showed a small advantage in specific cases.
  • The proposed prediction scheme improved accuracy by up to 14% compared to existing methods.

Conclusions:

  • Machine learning on linearly normalized sequences provides competitive or superior results.
  • Simple linear interpolation is a promising alternative for analyzing variable-length protein sequences.