
Support vector machine learning from heterogeneous data: an empirical analysis using protein sequence and structure.

Darrin P Lewis¹, Tony Jebara, William Stafford Noble

  • ¹Department of Computer Science, Columbia University, New York, NY 10027.

Bioinformatics (Oxford, England) | September 13, 2006
Summary
This summary is machine-generated.

Kernel methods such as the support vector machine (SVM) can integrate diverse biological data. An unweighted combination of kernels performs well when only two data types are combined, whereas weighted methods do better when multiple noisy datasets are integrated.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Machine Learning

Background:

  • Integrating diverse biological data (DNA, protein sequences, structures, expression data, networks) requires robust theoretical frameworks.
  • Kernel methods, particularly the support vector machine (SVM), offer a powerful approach for combining heterogeneous biological datasets.
  • Extensions of the SVM can weight each dataset according to its usefulness for a given classification task.
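
The combination of datasets described above can be written compactly. The following is a sketch in notation that does not appear in the summary: K_1, ..., K_m are assumed to be kernel matrices computed from m data sources over the same n proteins, and \mu_i are assumed non-negative weights. The unweighted combination is the special case \mu_i = 1 for every i. The second expression shows why such a sum is still a valid (positive semidefinite) kernel whenever each K_i is.

```latex
K_{\mu} = \sum_{i=1}^{m} \mu_i K_i, \qquad \mu_i \ge 0,
\qquad
v^{\top} K_{\mu} v = \sum_{i=1}^{m} \mu_i \, v^{\top} K_i v \ge 0
\quad \text{for all } v \in \mathbb{R}^{n}.
```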

Purpose of the Study:

  • To empirically evaluate the performance of the SVM for inferring gene functional annotations using combined protein sequence and structure data.
  • To compare the effectiveness of weighted versus unweighted SVM approaches when integrating multiple biological data sources.

Main Methods:

  • Empirical investigation of support vector machine (SVM) performance.
  • Utilizing combined protein sequence and structure data for gene functional annotation inference.
  • Comparison of unweighted and weighted kernel methods.
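
The unweighted and weighted combinations compared above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' code: the function names `linear_kernel` and `combine_kernels`, the toy feature values, and the hand-picked weights. How the weights are actually learned is outside the scope of this sketch.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def linear_kernel(features: Sequence[Sequence[float]]) -> Matrix:
    """Gram matrix of pairwise dot products for one data source."""
    return [[sum(a * b for a, b in zip(x, y)) for y in features] for x in features]


def combine_kernels(kernels: Sequence[Matrix],
                    weights: Optional[Sequence[float]] = None) -> Matrix:
    """Weighted sum of kernel matrices; weights=None gives the unweighted sum."""
    if not kernels:
        raise ValueError("need at least one kernel matrix")
    if weights is None:
        weights = [1.0] * len(kernels)
    if len(weights) != len(kernels):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        # Negative weights could break positive semidefiniteness.
        raise ValueError("weights must be non-negative")
    n = len(kernels[0])
    if any(len(k) != n or any(len(row) != n for row in k) for k in kernels):
        raise ValueError("all kernels must be n x n over the same items")
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]


# Toy data (made up): three proteins described by two hypothetical feature sets.
sequence_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
structure_features = [[2.0], [0.0], [1.0]]

k_seq = linear_kernel(sequence_features)
k_struct = linear_kernel(structure_features)

# Unweighted sum: every data source contributes equally.
k_unweighted = combine_kernels([k_seq, k_struct])

# Weighted sum: the (hypothetically noisier) structure kernel is down-weighted.
k_weighted = combine_kernels([k_seq, k_struct], weights=[1.0, 0.5])
```

Either combined matrix could then be passed to an SVM solver that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.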

Main Results:

  • The SVM demonstrates robustness to noise in biological datasets.
  • For two data types, an unweighted SVM performs comparably to or better than weighted methods.
  • When integrating multiple noisy datasets, weighted approaches outperform naive unweighted combinations.
  • A naive unweighted sum of kernels may suffice for many applications.

Conclusions:

  • The support vector machine is a versatile tool for integrating diverse biological data.
  • The choice between weighted and unweighted kernel methods depends on the number and noise level of the integrated datasets.
  • For simpler integration tasks, unweighted approaches are efficient and effective.