Feature refinement and rethinking attention for remote sensing image captioning.

Yunpeng Li¹,², Chengjin Tao¹,², Meng Liu¹,²

¹The Jiangsu Province Engineering Research Center of Integrated Circuit Reliability Technology and Testing System, Wuxi University, Wuxi, 214105, China.

Scientific Reports, March 14, 2025

Summary

This summary is machine-generated.

This study introduces a novel framework for remote sensing image captioning that refines features and uses rethinking attention. The approach improves accuracy by allowing models to reconsider visual information, leading to better descriptions.

Keywords:

Feature refinement; Remote sensing image captioning; Rethinking attention mechanism; Vision-language; Visual perception

Area of Science:

Computer Science
Artificial Intelligence
Remote Sensing

Background:

Attention mechanisms are crucial for remote sensing image captioning but struggle with restrictive assumptions and weak object correlations.
Existing visual feature extractors can fail when object relationships are not clearly defined.

Purpose of the Study:

To develop an advanced framework for remote sensing image captioning that overcomes limitations of current attention-driven models.
To enhance the accuracy and robustness of image captioning by refining visual features and enabling a 'rethinking' attention process.

Main Methods:

A feature refinement module lets grid-level features interact through a refinement gate, which weakens irrelevant visual information.
A rethinking attention mechanism, built on a rethinking LSTM layer, lets the decoder spontaneously focus on more than one region when predicting a single word.
A confidence rectification strategy is employed to model rethinking attention and learn discriminative contextual representations.
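
The three components above can be illustrated with a small sketch. This is not the authors' implementation: the gating formula, the use of the largest attention weight as a confidence score, the `threshold` parameter, and the additive query update that stands in for the rethinking LSTM layer are all assumptions chosen to keep the example short and dependency-free.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def refine_features(grid, gate_weights):
    """Scale each grid feature by a per-channel gate in (0, 1).

    Assumption: the gate is a sigmoid of the feature's interaction with
    the mean grid feature; the paper's gate may be defined differently.
    """
    dim = len(grid[0])
    mean_feat = [sum(f[d] for f in grid) / len(grid) for d in range(dim)]
    refined = []
    for feat in grid:
        gate = [sigmoid(gate_weights[d] * feat[d] * mean_feat[d]) for d in range(dim)]
        refined.append([g * x for g, x in zip(gate, feat)])
    return refined


def attend(grid, query):
    """Dot-product attention over grid features: returns (weights, context)."""
    weights = softmax([dot(f, query) for f in grid])
    dim = len(grid[0])
    context = [sum(w * f[d] for w, f in zip(weights, grid)) for d in range(dim)]
    return weights, context


def rethinking_attend(grid, query, threshold=0.5):
    """Attend once; attend a second time if the first pass is not confident.

    Assumptions: confidence is the largest attention weight, and the second
    query is the first query plus the first context vector (a stand-in for
    the rethinking LSTM layer). Returns the context and the list of
    attention maps that were used.
    """
    weights, context = attend(grid, query)
    if max(weights) >= threshold:
        return context, [weights]
    second_query = [q + c for q, c in zip(query, context)]
    weights2, context2 = attend(grid, second_query)
    merged = [(a + b) / 2 for a, b in zip(context, context2)]
    return merged, [weights, weights2]


if __name__ == "__main__":
    # Three hypothetical 2-channel grid cells and a decoder query.
    grid = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    refined = refine_features(grid, gate_weights=[1.0, 1.0])
    context, attention_maps = rethinking_attend(refined, query=[1.0, 0.0], threshold=0.9)
    print(len(attention_maps), "attention pass(es) used")
```

The sketch returns every attention map it used, so a caller can see when a single word drew on more than one region. A trained model would learn the gate and the confidence estimate rather than fix them by hand as done here.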

Main Results:

The proposed framework demonstrated superior performance across four benchmark datasets: NWPU-Captions, RSICD, UCM-Captions, and Sydney-Captions.
Significant improvements were achieved, particularly on the NWPU-Captions dataset, highlighting the effectiveness of the approach.

Conclusions:

The feature refinement and rethinking attention framework offers a more robust and effective solution for remote sensing image captioning.
The model's ability to reconsider visual focus and refine features leads to more accurate and contextually rich image descriptions.