Rethinking Attention Mechanisms in Vision Transformers with Graph Structures.

Hyeongjin Kim¹, Byoung Chul Ko¹

¹Department of Computer Engineering, Keimyung University, Daegu 42601, Republic of Korea.

Sensors (Basel, Switzerland)

February 24, 2024

Summary

This summary is machine-generated.

This study introduces the Graph Head Attention Vision Transformer (GHA-ViT), which replaces multi-head attention with graph head attention so that both local and global patch information is preserved. Compared with standard Vision Transformers, GHA-ViT improves accuracy while using fewer parameters.

Keywords:

graph attention network; graph head attention; lightweight model; multi-head attention; vision transformer

Area of Science:

Computer Vision
Machine Learning
Artificial Intelligence

Background:

Standard Vision Transformers (ViT) utilize Multi-Head Attention (MHA), which is parameter-intensive and can compromise image locality.
There is a need for more efficient and effective ViT architectures that preserve spatial information.
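
To make the "parameter-intensive" point concrete, the sketch below counts the learnable parameters in one standard multi-head attention block, assuming the usual layout of four square projections (query, key, value, output) with optional biases. The function name `mha_parameter_count` and the width of 768 are illustrative assumptions and are not taken from the paper.

```python
def mha_parameter_count(embed_dim, bias=True):
    """Count parameters of one standard multi-head attention block.

    Assumes four embed_dim x embed_dim projections (Q, K, V, output).
    The number of heads does not change the total, because the heads
    only split embed_dim into equal slices.
    """
    weights = 4 * embed_dim * embed_dim
    biases = 4 * embed_dim if bias else 0
    return weights + biases


# Illustrative width only (768 is the hidden size commonly used for ViT-Base).
per_block = mha_parameter_count(768)
print(per_block)  # 2362368
```

Under these assumptions the count grows quadratically with the embedding width, which is why attention layers account for a large share of a ViT's parameters.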

Purpose of the Study:

To propose a novel Vision Transformer architecture, GHA-ViT, incorporating Graph Head Attention (GHA).
To enhance the performance of ViTs while reducing computational complexity and parameter count.

Main Methods:

Replaced the Multi-Head Attention (MHA) mechanism in standard ViTs with a novel Graph Head Attention (GHA).
Applied graph structures to the attention heads of the transformer to better capture relationships within image patches.
Evaluated GHA-ViT on various datasets including CIFAR-10/100, MNIST, MNIST-F, and ImageNet-1K.
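
This summary does not give the exact formulation of GHA, so the following is only a generic sketch of how a graph structure can restrict attention among image patches. It is not the authors' method. It assumes a 4-connected grid graph over patches and plain scaled dot-product scores; the names `grid_neighbors` and `graph_masked_attention` are hypothetical.

```python
import math


def grid_neighbors(rows, cols):
    """Map each patch index to itself plus its 4-connected grid neighbours."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            idx = r * cols + c
            nbrs = [idx]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.append(rr * cols + cc)
            neighbors[idx] = nbrs
    return neighbors


def graph_masked_attention(queries, keys, values, neighbors):
    """Scaled dot-product attention where patch i attends only to neighbors[i]."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out_dim = len(values[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        nbrs = neighbors[i]
        scores = [scale * sum(a * b for a, b in zip(q, keys[j])) for j in nbrs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        outputs.append(
            [sum(p * values[j][d] for p, j in zip(probs, nbrs)) for d in range(out_dim)]
        )
        weights.append(dict(zip(nbrs, probs)))
    return outputs, weights


# Toy 2x2 patch grid with 2-dimensional patch embeddings.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
graph = grid_neighbors(2, 2)
outputs, weights = graph_masked_attention(patches, patches, patches, graph)
```

In this toy setup, patch 0 never attends to the diagonal patch 3, which illustrates how a graph can enforce locality. Making the graph denser would move the behaviour back toward the global attention of a standard ViT.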

Main Results:

GHA-ViT demonstrated superior performance over pure ViT models across multiple datasets.
Achieved a Top-1 accuracy of 81.7% on ImageNet-1K with the GHA-B model (approx. 29M parameters).
Reduced the parameter count 17-fold while improving accuracy by 0.4% on CIFAR-10 and 4.3% on CIFAR-100 compared to existing ViTs.
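
For readers unfamiliar with the metric, Top-1 accuracy is the fraction of samples whose highest-scoring class matches the true label. The sketch below is a minimal illustration with made-up scores. The function name `top1_accuracy` is hypothetical and the numbers are not results from the paper.

```python
def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Made-up class scores for three samples and two classes.
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
toy_labels = [1, 0, 0]
accuracy = top1_accuracy(toy_scores, toy_labels)  # 2 of 3 predictions are correct
```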

Conclusions:

The proposed GHA-ViT effectively maintains both locality and globality of image patches, ensuring attention diversity.
GHA-ViT presents a promising lightweight alternative to current state-of-the-art ViT models, balancing accuracy, parameter count, and computational operations.