Metric learning for text documents.

Guy Lebanon

  • Department of Statistics, Purdue University, 150 N. University Street, West Lafayette, IN 47907, USA. lebanon@stat.purdue.edu

IEEE Transactions on Pattern Analysis and Machine Intelligence | March 29, 2006
Summary
This summary is machine-generated.

This study introduces a method for learning a distance metric from data: it fits a Riemannian metric to a given set of points, and the resulting geodesic distance outperforms tfidf cosine similarity in text document classification.

Area of Science:

  • Machine Learning
  • Differential Geometry
  • Statistical Inference

Background:

  • Machine learning algorithms often require effective distance metrics for optimal performance.
  • Default metrics such as the Euclidean distance may not capture the underlying structure of the data.
  • Learning data-specific metrics is crucial for improving algorithm accuracy.

Purpose of the Study:

  • To develop a method for learning Riemannian metrics from data.
  • To apply this method to the multinomial simplex for text classification.
  • To compare the learned metric against existing measures like tfidf cosine similarity.
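
To make the two quantities being compared concrete, here is a minimal sketch, not the paper's code. It treats each document as a point on the multinomial simplex (its term-frequency proportions) and computes (a) the geodesic distance under the unmodified Fisher information metric, using the standard closed form 2·arccos(Σ√(pᵢqᵢ)), and (b) a tfidf cosine similarity. The function names, the toy documents, and the smoothed idf formula are all assumptions made for illustration.

```python
import math
from collections import Counter

def to_simplex(tokens, vocab):
    """Map a token list to term-frequency proportions (a point on the simplex)."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total for w in vocab]

def fisher_geodesic(p, q):
    """Geodesic distance on the simplex under the Fisher information metric."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # clamp guards against rounding above 1

def tfidf_cosine(p, q, idf):
    """Cosine similarity between idf-weighted term-frequency vectors."""
    u = [a * w for a, w in zip(p, idf)]
    v = [b * w for b, w in zip(q, idf)]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical toy corpus: three tiny "documents".
docs = [
    ["metric", "learning", "text"],
    ["metric", "geometry", "text"],
    ["cooking", "recipe", "text"],
]
vocab = sorted({w for d in docs for w in d})
n_docs = len(docs)
doc_freq = [sum(1 for d in docs if w in d) for w in vocab]
# Assumed smoothed idf variant; it keeps every weight strictly positive.
idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
points = [to_simplex(d, vocab) for d in docs]

for j in (1, 2):
    print(j, fisher_geodesic(points[0], points[j]), tfidf_cosine(points[0], points[j], idf))
```

Note the opposite orientations: the geodesic distance is small for similar documents, while the cosine similarity is large. A classifier such as nearest neighbor can use either one once the ordering is accounted for.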

Main Methods:

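  • Choosing a metric from a parametric family of Riemannian metrics by maximizing the inverse volume of the given data points.
  • Interpreting this choice as maximum likelihood estimation under a model that assigns probabilities inversely proportional to the Riemannian volume element.
  • Taking as candidate metrics on the multinomial simplex the pull-backs of the Fisher information metric under a Lie group of transformations.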
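
The pull-back idea can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation. It assumes the transformation family is componentwise rescaling by a positive weight vector followed by renormalization, which is one natural group action on the interior of the simplex; the summary above does not spell out the family. Because such a map is invertible, the geodesic distance under the pulled-back metric is the Fisher geodesic distance between the transformed points. The names `reweight`, `pullback_distance`, and `lam` are hypothetical, and the weights are fixed by hand here rather than learned by maximizing the inverse volume.

```python
import math

def reweight(x, lam):
    """Rescale each coordinate by a positive weight, then renormalize onto the simplex."""
    scaled = [xi * li for xi, li in zip(x, lam)]
    total = sum(scaled)
    return [s / total for s in scaled]

def fisher_geodesic(p, q):
    """Geodesic distance on the simplex under the Fisher information metric."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # clamp guards against rounding above 1

def pullback_distance(x, y, lam):
    """Distance under the metric pulled back through `reweight`:
    transform both points, then measure the Fisher geodesic distance."""
    return fisher_geodesic(reweight(x, lam), reweight(y, lam))

x = [0.5, 0.3, 0.2]
y = [0.2, 0.3, 0.5]
uniform = [1.0, 1.0, 1.0]            # identity transformation
downweight_middle = [1.0, 0.1, 1.0]  # hypothetical hand-picked weights

print(pullback_distance(x, y, uniform))
print(pullback_distance(x, y, downweight_middle))
```

Down-weighting the shared middle coordinate pushes the two points apart, much as an idf weight discounts a term common to many documents. Uniform weights recover the plain Fisher geodesic distance.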

Main Results:

  • The proposed method successfully learns a Riemannian metric tailored to the data.
  • The learned geodesic distance measure demonstrated superior performance compared to tfidf cosine similarity in text document classification.
  • The approach provides a statistically grounded method for metric learning.

Conclusions:

  • Data-driven Riemannian metric learning offers significant advantages over default metrics.
  • The developed technique enhances machine learning model performance, particularly in text analysis.
  • This work bridges differential geometry and machine learning for improved data representation.