MS-kNN: protein function prediction by integrating multiple data sources.

Liang Lan¹, Nemanja Djuric, Yuhong Guo

  • ¹Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA.

BMC Bioinformatics | March 22, 2013
Summary
This summary is machine-generated.

The Multi-Source k-Nearest Neighbor (MS-kNN) algorithm effectively predicts protein function by integrating sequence, interaction, and expression data. This approach improves accuracy and offers a cost-effective alternative to experimental methods for protein function determination.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Protein function determination is a critical challenge in the post-genomic era.
  • Experimental methods for determining protein function are accurate but resource-intensive.
  • Computational approaches offer a cost-effective alternative for predicting protein function.

Purpose of the Study:

  • To introduce and evaluate the Multi-Source k-Nearest Neighbor (MS-kNN) algorithm for protein function prediction.
  • To assess the effectiveness of integrating multiple data sources for improved prediction accuracy.
  • To compare MS-kNN performance against baseline algorithms.

Main Methods:

  • Developed the MS-kNN algorithm, which identifies k-nearest neighbors based on diverse similarity measures.
  • Utilized three data sources for similarity calculations: sequence similarity, protein-protein interactions, and gene expression.
  • Employed weighted averaging of neighbor functions for final prediction.
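
The steps listed under Main Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ms_knn_scores`, the per-source normalization, the optional `source_weights` parameter, and the toy proteins and GO-style labels are all assumptions made for illustration. For each data source, the sketch picks the k annotated proteins most similar to the query, lets each neighbor vote for its functions in proportion to its similarity, and averages the votes across sources.

```python
from typing import Dict, Optional, Set


def ms_knn_scores(
    query_sims: Dict[str, Dict[str, float]],
    annotations: Dict[str, Set[str]],
    k: int = 3,
    source_weights: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Score candidate functions for one query protein.

    query_sims maps a data source name to {protein_id: similarity to the query}.
    annotations maps protein_id to its known function terms.
    source_weights is a hypothetical knob; every source counts equally by default.
    """
    scores: Dict[str, float] = {}
    total_weight = 0.0
    for source, sims in query_sims.items():
        weight = 1.0 if source_weights is None else source_weights.get(source, 1.0)
        # Only annotated proteins can transfer functions; keep the k most similar.
        ranked = sorted(
            ((sim, pid) for pid, sim in sims.items() if pid in annotations),
            reverse=True,
        )
        neighbors = ranked[:k]
        norm = sum(sim for sim, _ in neighbors)
        if norm <= 0:
            continue  # this source has no usable neighbors for the query
        total_weight += weight
        # Similarity-weighted vote of each neighbor for each of its functions.
        for sim, pid in neighbors:
            for term in annotations[pid]:
                scores[term] = scores.get(term, 0.0) + weight * sim / norm
    if total_weight == 0:
        return {}
    # Average over the sources that contributed, so scores stay in [0, 1].
    return {term: value / total_weight for term, value in scores.items()}


# Toy data (made up): two sources, three annotated proteins.
example_sims = {
    "sequence": {"P1": 0.9, "P2": 0.1},
    "ppi": {"P2": 1.0, "P3": 1.0},
}
example_annotations = {
    "P1": {"GO:A"},
    "P2": {"GO:A", "GO:B"},
    "P3": {"GO:B"},
}
example_scores = ms_knn_scores(example_sims, example_annotations, k=2)
print(example_scores)
```

In this toy run the sequence source strongly favors GO:A while the interaction source favors GO:B, so the combined score ranks GO:A first but keeps GO:B as a plausible second candidate. That is the kind of complementary evidence the multi-source design is meant to capture.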

Main Results:

  • MS-kNN achieved an Area Under the Curve (AUC) of 0.848 for Gene Ontology (GO) molecular function prediction by integrating three data sources.
  • Performance was evaluated in the context of the Critical Assessment of Function Annotation (CAFA) 2011.
  • MS-kNN demonstrated higher accuracy than baseline algorithms such as GOtcha and BLAST, which rely solely on sequence similarity.
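
For readers unfamiliar with the AUC figure reported above, the following sketch computes it from its rank-based definition: the probability that a randomly chosen true annotation receives a higher score than a randomly chosen false one, with ties counted as one half. The function name `auc` and the toy labels and scores are assumptions for illustration. The summary does not say how the reported 0.848 was aggregated (for example, per GO term or per protein), so this shows only the metric itself.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie earns half credit.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example (made up): 1 = protein has the function, 0 = it does not.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a scale for reading the reported value.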

Conclusions:

  • The k-nearest neighbor algorithm is an efficient and effective model for protein function prediction.
  • Transferring functional information across diverse organisms enhances prediction capabilities.
  • Integrating multiple sources of protein data significantly benefits function prediction accuracy.