Self-Adaptive Vision-Language Tracking With Context Prompting.

Jie Zhao, Xin Chen, Shengming Li

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society

December 8, 2025

Summary

This summary is machine-generated.

This study introduces a self-adaptive vision-language tracking framework using CLIP to bridge modality gaps. The method dynamically adapts language cues to visual context, enhancing tracking robustness and performance.

Area of Science:

Computer Vision
Natural Language Processing
Artificial Intelligence

Background:

Existing vision-language tracking methods struggle with modality gaps and the mismatch between static language and dynamic visual information.
These limitations hinder the effective use of language semantics to improve tracking robustness.

Purpose of the Study:

To propose a self-adaptive vision-language tracking framework that effectively bridges the modality gap.
To enhance tracking robustness by enabling language features to dynamically evolve with visual context.

Main Methods:

Leveraging the pre-trained multi-modal CLIP model for aligned visual-language representations.
Introducing a context-aware prompting mechanism for dynamic adaptation of linguistic cues based on visual context.
Employing a unified one-stream Transformer architecture that can be trained jointly on vision-only and vision-language tracking.
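
The context-aware prompting idea can be illustrated with a small sketch. This is not the authors' implementation: it only shows one plausible way a static language embedding could be re-weighted by the current visual context, using a single attention step in plain Python. The function name `adapt_language_embedding`, the mixing weight `alpha`, and the toy vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adapt_language_embedding(lang, visual_tokens, alpha=0.5):
    """Blend a static language embedding with attention-pooled visual context.

    lang          : static text embedding (e.g. from a CLIP-style text encoder)
    visual_tokens : visual context tokens for the current frame
    alpha         : hypothetical mixing weight, not a value from the paper
    """
    scale = math.sqrt(len(lang))
    # The language embedding acts as the query over the visual tokens.
    weights = softmax([dot(lang, v) / scale for v in visual_tokens])
    # Attention-weighted average of the visual tokens.
    context = [
        sum(w * v[i] for w, v in zip(weights, visual_tokens))
        for i in range(len(lang))
    ]
    # Residual-style blend: the language cue shifts toward the visual context.
    return [(1 - alpha) * l + alpha * c for l, c in zip(lang, context)]


# The same description yields different embeddings as the scene changes.
description = [1.0, 0.0]
frame_a = [[0.9, 0.1], [0.0, 1.0]]
frame_b = [[0.2, 0.8], [0.1, 0.9]]
print(adapt_language_embedding(description, frame_a))
print(adapt_language_embedding(description, frame_b))
```

With `alpha=0` the function returns the static embedding unchanged, which corresponds to the static-language baseline the Background section describes.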

Main Results:

The proposed framework effectively bridges the modality gap and enhances tracking robustness.
The large model achieved 55.0% AUC on LaSOT_EXT and 69.0% AUC on TNL2K.
The language-only tracking model demonstrated performance comparable to state-of-the-art vision-only methods on TNL2K.
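
For readers unfamiliar with the AUC figures above, the sketch below shows how a success-plot AUC is conventionally computed in single-object tracking benchmarks: the fraction of frames whose predicted box overlaps the ground truth by more than a threshold, averaged over thresholds from 0 to 1. The summary does not state the exact evaluation protocol, so the 21 evenly spaced thresholds, the strict `>` comparison, and the `(x, y, w, h)` box format are assumptions based on common toolkit practice.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x, y, w, h) format."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(pred_boxes, gt_boxes, num_thresholds=21):
    """Area under the success curve, approximated as the mean success rate.

    Assumes thresholds 0.00, 0.05, ..., 1.00 and a strict '>' comparison;
    individual benchmark toolkits may differ in these details.
    """
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    success = [
        sum(o > t for o in overlaps) / len(overlaps) for t in thresholds
    ]
    return sum(success) / len(success)


# Toy two-frame sequence: one exact match and one partial overlap.
gt = [(0, 0, 10, 10), (0, 0, 10, 10)]
pred = [(0, 0, 10, 10), (5, 0, 10, 10)]
print(f"AUC = {100 * success_auc(pred, gt):.1f}%")
```

Under these assumptions even a perfect tracker scores slightly below 100%, because no overlap can strictly exceed the final threshold of 1.0.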

Conclusions:

The self-adaptive framework successfully exploits the advantages of language descriptions to improve visual tracking.
Dynamic adaptation of language embeddings to evolving visual context is key to enhanced robustness.
The unified architecture supports versatile training scenarios, advancing vision-language tracking research.
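
The "versatile training scenarios" point can be pictured with a minimal sketch of how a one-stream model might accept input with or without a language description. This is an illustration only: the function name `build_one_stream_input`, the token ordering, and the segment-id scheme are hypothetical and are not taken from the paper.

```python
def build_one_stream_input(template_tokens, search_tokens, language_tokens=None):
    """Concatenate all available modalities into one token sequence.

    Returns (tokens, segment_ids). The segment ids are a hypothetical scheme:
    0 = language, 1 = template image, 2 = search image. Passing no language
    tokens gives the vision-only case with the same downstream model.
    """
    language_tokens = list(language_tokens or [])
    tokens = language_tokens + list(template_tokens) + list(search_tokens)
    segment_ids = (
        [0] * len(language_tokens)
        + [1] * len(template_tokens)
        + [2] * len(search_tokens)
    )
    return tokens, segment_ids


# Placeholder strings stand in for embedding vectors.
template = ["t0", "t1"]
search = ["s0", "s1", "s2"]

vision_only = build_one_stream_input(template, search)
vision_language = build_one_stream_input(template, search, ["w0", "w1"])
print(vision_only[1])
print(vision_language[1])
```

Because both cases produce a single sequence, the same Transformer weights can in principle be trained on either, which is the property the unified architecture relies on.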