Updated: Apr 11, 2026

A Causal Lens on Non-RGB Vision Sensor Understanding in Vision-Language Models.

Youngjoon Yu, Yong Man Ro

    IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society | April 9, 2026
    PubMed
    Summary
    This summary is machine-generated.

    Vision-Language Models (VLMs) struggle with non-RGB sensor data because of an RGB-centric bias. A new benchmark suite, CausalSense, and a causal learning framework improve VLMs' understanding of thermal, depth, and X-ray sensor data.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Vision-Language Models (VLMs) excel with RGB images but fail with non-RGB sensor data (thermal, depth, hyperspectral, X-ray).
    • This failure is due to an RGB-centric bias, causing VLMs to misinterpret unique physical properties of non-RGB modalities.

    Purpose of the Study:

    • To systematically evaluate and address the RGB-centric bias in VLMs using non-RGB sensor data.
    • To introduce a novel benchmark suite, CausalSense, and a causal learning framework to mitigate this bias.

    Main Methods:

    • Developed CausalSense, a benchmark suite for evaluating VLM bias on non-RGB data.
    • Designed a causal learning framework using confounder dictionaries and backdoor adjustments.
    • Integrated sensor-specific knowledge into VLMs without extensive retraining.
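
The summary does not describe how the confounder dictionary and backdoor adjustment are implemented, so the following is only a minimal sketch of the textbook backdoor adjustment, P(y | do(x)) = Σ_z P(y | x, z) P(z), and not the authors' method. The names `toy_predict`, `backdoor_adjust`, `confounder_dict`, and `prior`, along with all numeric values, are hypothetical and chosen for illustration. Each dictionary entry stands for one confounder stratum z, and the adjusted prediction averages the per-stratum predictions weighted by the prior P(z).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_predict(x, z):
    """Hypothetical stand-in for P(y | x, z) over two classes.

    x is a sensor feature vector and z is one confounder entry.
    A real model would be a learned network; this is a toy scorer.
    """
    score = sum(xi * zi for xi, zi in zip(x, z))
    return softmax([score, -score])


def backdoor_adjust(x, confounder_dict, prior, predict):
    """Approximate P(y | do(x)) = sum_z P(y | x, z) * P(z).

    confounder_dict: list of confounder vectors z (assumed given).
    prior: unnormalized weights for P(z), one per dictionary entry.
    """
    total = sum(prior)
    weights = [p / total for p in prior]
    per_z = [predict(x, z) for z in confounder_dict]
    num_classes = len(per_z[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, per_z))
        for c in range(num_classes)
    ]


# Illustrative values only; none of these come from the paper.
confounder_dict = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
prior = [0.5, 0.3, 0.2]
x = [0.8, -0.4]

adjusted = backdoor_adjust(x, confounder_dict, prior, toy_predict)
print(adjusted)
```

Because the prediction is averaged over every confounder stratum rather than conditioned on whichever stratum co-occurs with the input, a spurious association routed through z contributes the same weight to every input. That is the general motivation for backdoor adjustment; how the paper builds its dictionary or approximates the sum is not stated in this summary.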

    Main Results:

    • State-of-the-art VLMs show significant performance deficits with non-RGB sensor comprehension.
    • The proposed causal deconfounded cross-modal encoder substantially improved VLM reasoning about physical attributes.
    • The framework measurably reduced the performance gap on non-RGB sensor data.

    Conclusions:

    • Current VLMs exhibit a critical RGB-centric bias limiting their use with diverse sensor data.
    • The CausalSense benchmark and causal framework enable more resilient, sensor-aware VLMs.
    • This research facilitates VLM interpretation of phenomena beyond the visible spectrum.