



Intra- and Inter-Head Orthogonal Attention for Image Captioning.

Xiaodan Zhang, Aozhe Jia, Junzhong Ji

    IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society
    |March 3, 2025
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces Intra- and Inter-Head Orthogonal Attention (I²OA) to improve multi-head attention in image captioning. Orthogonal constraints spread attention more broadly within each head and reduce redundancy between heads, leading to better image descriptions.


    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Multi-head attention (MA) is crucial for image captioning, enabling models to focus on key information from different representation subspaces.
    • Current MA methods lack mechanisms to ensure appropriate attention distribution across subspaces, leading to over-focused heads and redundancy.
    • This limits the diversity and representation power of attention mechanisms in image captioning.
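To make the background concrete, here is a minimal NumPy sketch of standard multi-head scaled dot-product self-attention. It is a generic illustration, not the captioning model from the paper: the function name `multi_head_attention`, the weight matrices `w_q`, `w_k`, `w_v`, and the omission of the usual output projection are all assumptions made for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(x, w_q, w_k, w_v, num_heads):
    """Self-attention over x with shape (n, d_model).

    Returns (output, attn), where attn has shape (num_heads, n, n) and
    each row of each head is a probability distribution over the n inputs.
    The final output projection is omitted (illustrative simplification).
    """
    n, d_model = x.shape
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    def split(t):
        # (n, d_model) -> (num_heads, n, d_head): one subspace per head
        return t.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    heads = attn @ v  # (num_heads, n, d_head)
    output = heads.transpose(1, 0, 2).reshape(n, d_model)  # concatenate heads
    return output, attn
```

Each head attends within its own `d_head`-dimensional slice of the projected features. Nothing in this computation prevents two heads from learning nearly identical attention maps, which is the redundancy problem described above.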

    Purpose of the Study:

    • To propose a novel Intra- and Inter-Head Orthogonal Attention (I²OA) mechanism to enhance MA for image captioning.
    • To address the issues of over-focused attention and head redundancy in existing MA models.
    • To improve the performance of image captioning models without increasing complexity or parameters.

    Main Methods:

    • Introduced Intra-Head Orthogonal Attention, which applies orthogonal constraints within each head to shift attention from object-centric to content-aware.
    • Implemented Inter-Head Orthogonal Attention to reduce redundancy between heads by applying orthogonal constraints across heads, enhancing subspace diversity.
    • Integrated I²OA into existing multi-head attention-based image captioning frameworks.
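The summary does not give the exact form of the orthogonal constraints, so the following is only a sketch of one plausible reading, not the authors' formulation. It assumes the constraints can be written as penalties on the off-diagonal entries of Gram matrices built from attention maps: within a head, between the attention rows of different queries; across heads, between the flattened maps of different heads. The function names `intra_head_penalty` and `inter_head_penalty` are hypothetical.

```python
import numpy as np


def _unit_rows(a, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit L2 norm."""
    return a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)


def _offdiag_sq_mean(gram):
    """Mean squared off-diagonal entry over a batch of (m, m) Gram matrices."""
    m = gram.shape[-1]
    if m < 2:
        return 0.0  # a single vector has nothing to be orthogonal to
    mask = 1.0 - np.eye(m)
    n_batches = gram.size // (m * m)
    return float(((gram * mask) ** 2).sum() / (n_batches * m * (m - 1)))


def intra_head_penalty(attn):
    """Assumed intra-head term: attn has shape (heads, n_queries, n_keys).

    Penalizes cosine similarity between the attention rows of different
    queries inside the same head; 0 means the rows are mutually orthogonal.
    """
    rows = _unit_rows(attn)
    gram = rows @ rows.transpose(0, 2, 1)  # (heads, n_queries, n_queries)
    return _offdiag_sq_mean(gram)


def inter_head_penalty(attn):
    """Assumed inter-head term: penalizes cosine similarity between the
    flattened attention maps of different heads; 0 means no overlap."""
    flat = _unit_rows(attn.reshape(attn.shape[0], -1))
    gram = flat @ flat.T  # (heads, heads)
    return _offdiag_sq_mean(gram[None])
```

Under this reading, both terms would be added to the captioning loss as regularizers, which changes training but adds no parameters to the model. Two identical heads give an inter-head penalty close to 1, while heads with disjoint attention patterns give 0.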

    Main Results:

    • The proposed I²OA method effectively improved the performance of image captioning models on the MS COCO dataset.
    • Intra-Head Orthogonal Attention led to more comprehensive content-aware attention.
    • Inter-Head Orthogonal Attention successfully reduced head redundancy and increased representation diversity.

    Conclusions:

    • I²OA offers an efficient and flexible approach to improve multi-head attention for image captioning.
    • The orthogonal regularization effectively addresses limitations in attention focus and head redundancy.
    • The method demonstrates significant performance gains without additional model complexity or parameters.