
Adaptive Latent Graph Representation Learning for Image-Text Matching.

Mengxiao Tian, Xinxiao Wu, Yunde Jia

    IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society | April 4, 2023
    Summary
    This summary is machine-generated.

    This study introduces an adaptive latent graph representation learning method that improves image-text matching by reducing distractions from irrelevant image entities and noisy words. The method learns a common embedding space that better aligns the two modalities.


    Area of Science:

    • Computer Vision
    • Natural Language Processing
    • Machine Learning

    Background:

    • Image-text matching is hindered by the modality gap between visual and textual representations.
    • Existing methods that model entity relationships are susceptible to irrelevant visual and textual information.
    • Distractions in entity relationships hinder the learning of effective common embedding spaces.
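
To make the idea of a common embedding space concrete, here is a minimal sketch of how matching is usually scored once an image and its candidate captions have been mapped into the same vector space. It is not the method from the paper: the function names `cosine` and `rank_captions` and the toy vectors are illustrative assumptions, and cosine similarity is simply one common choice of score.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_captions(image_vec, caption_vecs):
    """Return caption indices sorted from most to least similar to the image."""
    scores = [cosine(image_vec, c) for c in caption_vecs]
    return sorted(range(len(caption_vecs)), key=lambda i: scores[i], reverse=True)

# Hypothetical embeddings; in practice these come from learned encoders.
image = [1.0, 0.0, 1.0]
captions = [
    [0.0, 1.0, 0.0],  # unrelated caption
    [2.0, 0.0, 2.0],  # same direction as the image embedding
    [1.0, 1.0, 0.0],  # partially related caption
]
ranking = rank_captions(image, captions)
print(ranking)
```

In this picture, a distracting entity or noisy word shifts an embedding away from its true match, which is the failure mode the background points describe.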

    Purpose of the Study:

    • To propose an adaptive latent graph representation learning method for image-text matching.
    • To reduce distractions from irrelevant entities in images and noisy words in text.
    • To narrow the modality gap and improve matching performance.

    Main Methods:

    • Used an improved graph variational autoencoder to disentangle distracting factors from latent relationship factors.
    • Jointly learned latent textual graph representations, latent visual graph representations, and a shared visual-textual graph embedding space.
    • Introduced an adaptive cross-attention mechanism that attends to features of the latent graph representations across modalities.
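
The cross-modal attention step in the last point can be illustrated with plain scaled dot-product cross-attention, in which each textual node attends over all visual nodes. This is only a sketch under stated assumptions: the summary does not describe what makes the paper's mechanism adaptive, so that part is omitted, and the names `cross_attention`, `text_nodes`, and `visual_nodes` are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one modality (queries) to another.

    Returns the attended features and the attention weights, one row per query.
    """
    d = len(keys[0])
    attended, weights = [], []
    for q in queries:
        # Similarity of this query node to every key node, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, computed per feature dimension.
        attended.append(
            [sum(w_j * v[c] for w_j, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return attended, weights

# Toy node features standing in for latent graph representations.
text_nodes = [[1.0, 0.0], [0.0, 1.0]]
visual_nodes = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
attended, weights = cross_attention(text_nodes, visual_nodes, visual_nodes)
print(weights[0])
```

Each textual node ends up as a mixture of visual nodes weighted by relevance, so weakly related nodes contribute little to the attended feature.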

    Main Results:

    • Experiments on the Flickr30K and COCO datasets demonstrated the effectiveness of the proposed method.
    • The adaptive latent graph approach successfully reduced distractions from irrelevant visual and textual elements.
    • The adaptive cross-attention mechanism further enhanced feature alignment between image and text modalities.
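
The summary does not say which metric was used. Recall@K is a common choice for retrieval benchmarks such as Flickr30K and COCO, so the sketch below assumes it purely for illustration. The function name `recall_at_k` and the similarity values are made up and are not results from the study.

```python
def recall_at_k(similarity, ground_truth, k):
    """Fraction of queries whose correct item appears in the top-k results.

    similarity[q][j] is the score between query q and candidate j;
    ground_truth[q] is the index of the correct candidate for query q.
    """
    hits = 0
    for q, row in enumerate(similarity):
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if ground_truth[q] in top_k:
            hits += 1
    return hits / len(similarity)

# Toy image-to-text similarity matrix (3 images x 3 captions), invented values.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
gt = [0, 1, 2]  # caption i is the correct match for image i
r1 = recall_at_k(sim, gt, 1)
r2 = recall_at_k(sim, gt, 2)
print(r1, r2)
```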

    Conclusions:

    • The proposed adaptive latent graph representation learning method effectively addresses distractions in image-text matching.
    • The method successfully narrows the modality gap, leading to improved matching performance.
    • This approach offers a promising direction for future research in cross-modal retrieval and understanding.