Integrating structured biological data by Kernel Maximum Mean Discrepancy.

Karsten M Borgwardt¹, Arthur Gretton, Malte J Rasch

  • ¹ Institute for Computer Science, Ludwig-Maximilians-University, Munich, Germany. kb@dbs.ifi.lmu.de

Bioinformatics (Oxford, England) | July 29, 2006
Summary
This summary is machine-generated.

We developed a novel kernel-based statistical test, based on the Maximum Mean Discrepancy (MMD), to determine whether two datasets originate from the same distribution. The test supports the integration of diverse biological data and outperformed competing methods in the reported experiments.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Statistical Learning

Background:

  • Data integration in bioinformatics often requires determining if two observation sets share the same underlying distribution.
  • Existing methods face challenges with diverse data types common in molecular biology.

Purpose of the Study:

  • To propose a novel kernel-based statistical test for comparing distributions of biological data.
  • To address the challenge of data integration across various formats including vectors, strings, sequences, and graphs.

Main Methods:

  • The study introduces the Maximum Mean Discrepancy (MMD) test, a kernel-based statistical method.
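  • MMD is the largest difference between the mean values that a function takes on the two samples, maximized over a class of functions defined by a kernel (the unit ball of a reproducing kernel Hilbert space). The kernel trick lets this maximum be computed from pairwise kernel evaluations alone.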
  • The test is applicable to multivariate and structured data types.
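
To make the method concrete, here is a minimal Python sketch of a biased empirical estimate of squared MMD for vector data. The Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel` and `mmd2_biased` are assumptions chosen for illustration; the summary does not state which kernel or estimator the study used.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def mmd2_biased(x, y, sigma=1.0):
    """Biased squared-MMD estimate: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

# Synthetic data (an assumption, not data from the study).
rng = np.random.default_rng(0)
same = mmd2_biased(rng.normal(size=(200, 5)), rng.normal(size=(200, 5)))
shifted = mmd2_biased(rng.normal(size=(200, 5)),
                      rng.normal(loc=1.0, size=(200, 5)))
print(same, shifted)  # the shifted pair should give the larger value
```

Only kernel evaluations between pairs of observations enter the estimate, which is why the same recipe carries over to any data type for which a kernel is available.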

Main Results:

  • MMD was evaluated on critical data integration tasks: microarray data comparability, cancer diagnosis, and schema matching for protein function.
  • The MMD test demonstrated high accuracy in identifying samples from the same distribution, even in high-dimensional settings.
  • MMD outperformed competing methods in all tested experiments.
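
One way to turn an MMD value into a same-distribution decision is a permutation test, sketched below. This is a generic resampling technique swapped in for illustration; the summary does not describe how the study sets its acceptance threshold, and the kernel, bandwidth, number of permutations, and function names here are all assumptions.

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2_from_gram(k, n):
    """Biased squared MMD from a pooled Gram matrix (first n rows are sample 1)."""
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()

def permutation_test(x, y, n_perm=200, sigma=1.0, seed=0):
    """Return (observed squared MMD, permutation p-value)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = len(x)
    k = rbf(pooled, pooled, sigma)  # computed once, reused for every shuffle
    observed = mmd2_from_gram(k, n)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        # Reassign observations to the two groups at random.
        if mmd2_from_gram(k[np.ix_(idx, idx)], n) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)

# Synthetic data (an assumption, not data from the study).
rng = np.random.default_rng(1)
a = rng.normal(size=(100, 3))
b = rng.normal(size=(100, 3))
c = rng.normal(loc=1.0, size=(100, 3))
print(permutation_test(a, b))  # same distribution
print(permutation_test(a, c))  # shifted distribution, small p-value expected
```

A small p-value suggests the two samples differ in distribution; a large one means the test found no evidence against pooling them.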

Conclusions:

  • A novel, fast, and easy-to-implement statistical test (MMD) was developed for assessing if two samples originate from the same distribution.
  • The MMD test is versatile, supporting both multivariate and structured biological data.
  • Experimental results confirm the effectiveness and accuracy of the MMD test in practical data integration scenarios.
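
Because the statistic depends on the data only through a kernel, structured inputs can be handled by swapping in a suitable kernel. The sketch below does this for strings with a k-mer spectrum kernel; the kernel choice, the toy sequences, and the function names are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

def kmer_counts(seq, k=2):
    """Count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s, t, k=2):
    """Inner product of the k-mer count vectors of two strings."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(cs[m] * ct[m] for m in cs.keys() & ct.keys())

def mean_kernel(xs, ys, k=2):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(spectrum_kernel(a, b, k) for a in xs for b in ys)
    return total / (len(xs) * len(ys))

def mmd2_strings(xs, ys, k=2):
    """Biased squared-MMD estimate between two samples of strings."""
    return (mean_kernel(xs, xs, k) + mean_kernel(ys, ys, k)
            - 2.0 * mean_kernel(xs, ys, k))

# Toy sequences (hypothetical, for illustration only).
at_rich = ["ATATATAT", "AATTAATT", "TATATATA"]
gc_rich = ["GCGCGCGC", "GGCCGGCC", "CGCGCGCG"]
print(mmd2_strings(at_rich, at_rich))  # 0.0 for identical samples
print(mmd2_strings(at_rich, gc_rich))  # positive: the samples share no 2-mers
```

The same pattern would apply to graphs or other structured objects, given a valid kernel for that data type.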