Semi-supervised protein classification using cluster kernels.

Jason Weston¹, Christina Leslie, Eugene Ie

  • ¹NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA. jasonw@nec-labs.com

Bioinformatics (Oxford, England) | May 21, 2005
Summary
This summary is machine-generated.

This study introduces cluster kernel techniques to enhance protein sequence representation using unlabeled data. These methods improve protein classification accuracy and computational efficiency compared to existing approaches.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Structural Bioinformatics

Background:

  • Accurate protein classification relies on effective amino acid sequence representation.
  • String kernels achieve state-of-the-art performance but primarily use labeled data.
  • Unlabeled protein sequence data is significantly more abundant than labeled data.
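To make the idea of a string kernel concrete, here is a minimal sketch of one well-known example, the k-mer spectrum kernel, which scores two sequences by the inner product of their substring counts. The summary does not say which string kernel the study builds on, so the choice of kernel, the function names, and the default `k=3` are illustrative assumptions rather than the authors' setup.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count every length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of sequences a and b.

    Illustrative only: the summary does not name the base kernel used.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers that are absent from b.
    return sum(n * cb[kmer] for kmer, n in ca.items())
```

Sequences that share many k-mers receive a high score, and sequences with no k-mer in common score zero. Note that a kernel like this only ever looks at the two sequences being compared, which is why it cannot benefit from unlabeled data on its own.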

Purpose of the Study:

  • To develop scalable cluster kernel techniques for incorporating unlabeled data into protein sequence representation.
  • To improve the classification performance of existing string kernel methods.
  • To offer a computationally efficient alternative for utilizing unlabeled protein data.

Main Methods:

  • Development of novel cluster kernel techniques.
  • Integration of unlabeled protein sequence data into sequence representation.
  • Comparative analysis against standard methods and existing cluster kernel approaches.
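The summary does not spell out how the cluster kernels are constructed. As a rough illustration of how unlabeled data can enter a sequence representation, the sketch below averages a base kernel over each sequence's "neighborhood" of similar unlabeled sequences. The base kernel, the `neighbors` lookup table, and all function names are hypothetical. In practice the neighborhoods would come from some similarity search over an unlabeled database, which is assumed here rather than implemented.

```python
from collections import Counter


def spectrum_kernel(a, b, k=3):
    """Base kernel (assumed): inner product of k-mer count vectors."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    return sum(n * cb[kmer] for kmer, n in ca.items())


def neighborhood_kernel(x, y, neighbors, base=spectrum_kernel):
    """Average the base kernel over the neighborhoods of x and y.

    neighbors is a hypothetical dict mapping a sequence to a list of
    similar sequences (itself included), found using unlabeled data.
    A sequence with no entry is treated as its own only neighbor.
    """
    nx = neighbors.get(x, [x])
    ny = neighbors.get(y, [y])
    total = sum(base(u, v) for u in nx for v in ny)
    return total / (len(nx) * len(ny))


# Toy usage with a made-up neighborhood table.
neighbors = {"AAAA": ["AAAA", "AAAT"]}
score = neighborhood_kernel("AAAA", "AAA", neighbors)
```

With an empty `neighbors` table the function reduces to the base kernel, so any gain in a scheme like this comes entirely from the unlabeled neighbors.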

Main Results:

  • Demonstrated significant improvement in protein classification performance.
  • Outperformed standard methods for utilizing unlabeled data, including adding close homologs.
  • Achieved performance equal to or superior to previous cluster kernel methods with enhanced computational efficiency.

Conclusions:

  • Cluster kernel techniques effectively leverage unlabeled data for improved protein sequence representation and classification.
  • The proposed methods offer a scalable and computationally efficient solution for protein classification.
  • This work advances the utilization of large unlabeled biological datasets in machine learning applications.