
Updated: May 24, 2025


Rebalanced Vision-Language Retrieval Considering Structure-Aware Distillation.

Yang Yang, Wenjuan Xi, Luping Zhou

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society

March 3, 2025

Summary

This summary is machine-generated.

Modal imbalance in vision-language retrieval hinders performance. This study proposes structure-preserved matching to rebalance modalities, improving cross-modal retrieval accuracy and enhancing single-modal capabilities.


Area of Science:

Computer Science
Artificial Intelligence
Machine Learning

Background:

Vision-language retrieval seeks to align representations from different modalities in a shared latent space.
Modal balance, where each modality sufficiently represents the others, is a key assumption.
Modal imbalance, caused by noise or insufficient information, is a common challenge impacting retrieval performance.
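
To make the shared-latent-space idea concrete, here is a minimal sketch of retrieval by cosine similarity. It is not taken from the paper: the function names, the two-dimensional toy embeddings, and the choice of cosine similarity as the matching score are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity_matrix(rows, cols):
    """Return S where S[i][j] is the cosine similarity of rows[i] and cols[j]."""
    rows_n = [l2_normalize(v) for v in rows]
    cols_n = [l2_normalize(v) for v in cols]
    return [[sum(x * y for x, y in zip(u, w)) for w in cols_n] for u in rows_n]


def retrieve(query_embeddings, gallery_embeddings):
    """For each query, list gallery indices from most to least similar."""
    sims = cosine_similarity_matrix(query_embeddings, gallery_embeddings)
    return [sorted(range(len(row)), key=lambda j: -row[j]) for row in sims]


# Hypothetical toy embeddings that already live in one shared 2-D space.
image_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_emb = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.7]]

# Text-to-image retrieval: each caption ranks all images.
ranking = retrieve(text_emb, image_emb)
print(ranking)
```

In this toy setup each caption ranks its paired image first. Under modal imbalance, for example a noisy or uninformative caption, the corresponding row of the similarity matrix becomes less discriminative, which is the failure mode the study targets.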

Purpose of the Study:

To investigate the impact of modal imbalance on cross-modal retrieval.
To propose a novel method for rebalancing modalities and improving retrieval accuracy.
To enhance both cross-modal and single-modal retrieval capabilities.

Main Methods:

Demonstrated that standard cross-modal matching is suboptimal under modal imbalance.
Introduced structure-preserved matching to address challenges in similarity measurement.
Developed a multi-granularity cross-modal matching approach with structure-aware distillation and relational matching.

Main Results:

The proposed method effectively rebalances cross-modal matching by learning structure-preserved representations.
Structure-aware distillation regularizes geometric consistency between cross-modal and intra-modal representations.
Experimental results show superior cross-modal retrieval performance and improved single-modal retrieval.
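
One possible reading of "geometric consistency between cross-modal and intra-modal representations" is sketched below. This is an illustrative assumption, not the authors' loss: the penalty simply measures how far the image-to-text similarity matrix is from the image-to-image and text-to-text similarity matrices of the same batch. The function name `structure_consistency_penalty` and the squared-error form are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity_matrix(rows, cols):
    """Return S where S[i][j] is the cosine similarity of rows[i] and cols[j]."""
    rows_n = [l2_normalize(v) for v in rows]
    cols_n = [l2_normalize(v) for v in cols]
    return [[sum(x * y for x, y in zip(u, w)) for w in cols_n] for u in rows_n]


def structure_consistency_penalty(image_emb, text_emb):
    """Mean squared gap between cross-modal and intra-modal similarity structure.

    Assumption for illustration only: image_emb[i] and text_emb[i] form a pair,
    and a balanced batch should show the same pairwise geometry whether
    similarities are measured within a modality or across modalities.
    """
    s_ii = cosine_similarity_matrix(image_emb, image_emb)  # image-image
    s_tt = cosine_similarity_matrix(text_emb, text_emb)    # text-text
    s_it = cosine_similarity_matrix(image_emb, text_emb)   # image-text
    n = len(image_emb)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (s_it[i][j] - s_ii[i][j]) ** 2
            total += (s_it[i][j] - s_tt[i][j]) ** 2
    return total / (2 * n * n)


# Balanced toy batch: both modalities share the same geometry.
balanced = [[1.0, 0.0], [0.0, 1.0]]
penalty_balanced = structure_consistency_penalty(balanced, balanced)

# Imbalanced toy batch: the text embeddings nearly collapse onto one direction.
collapsed_text = [[1.0, 0.0], [1.0, 0.1]]
penalty_imbalanced = structure_consistency_penalty(balanced, collapsed_text)

print(penalty_balanced, penalty_imbalanced)
```

The penalty is zero when both modalities share the same pairwise geometry and grows when one modality loses structure. In a training setup, a term of this kind would typically be added to the main matching objective with a weighting coefficient; the summary does not state how the paper combines its terms.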

Conclusions:

Modal imbalance significantly affects cross-modal retrieval, necessitating specialized approaches.
Structure-preserved matching offers a robust solution for rebalancing modalities.
The proposed method achieves state-of-the-art performance, highlighting the importance of structural consistency in cross-modal learning.