Exploiting Depth From Single Monocular Images for Object Detection and Semantic Segmentation.

Yuanzhouhan Cao, Chunhua Shen, Heng Tao Shen

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society | January 24, 2017

Summary

This summary is machine-generated.

Augmenting standard RGB images with estimated depth improves computer vision tasks such as object detection and semantic segmentation. The depth is predicted from a single image by a deep learning model, so the gains are available without a dedicated depth sensor.

Area of Science:

Computer Vision
Deep Learning
Machine Learning

Background:

Measured depth, often captured with sensors such as Microsoft Kinect, improves performance on computer vision tasks.
Most available images carry no depth information, which limits their use in depth-aware applications.
Existing depth-aware methods rely on direct depth measurement, which is not always available.

Purpose of the Study:

To demonstrate that estimated depth can augment RGB images to improve computer vision task performance.
To develop and evaluate methods for integrating estimated depth features with RGB data.
To propose a novel RGB-D semantic segmentation approach using multi-task learning.

Main Methods:

Learned a deep depth estimation model from monocular RGB images.
Extracted deep depth features from estimated depth maps.
Combined RGB and estimated depth features for object detection and semantic segmentation.
Developed an RGB-D semantic segmentation method using a multi-task learning framework (semantic prediction and depth regression).
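
The last two steps can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not state how features are combined or which losses are used, so concatenation as the fusion step, softmax cross-entropy for semantic prediction, squared error for depth regression, and the `depth_weight` balancing factor are all assumptions, and every function name is hypothetical.

```python
import math


def fuse_features(rgb_feat, depth_feat):
    """Combine RGB and estimated-depth features by concatenation (assumed fusion rule)."""
    return list(rgb_feat) + list(depth_feat)


def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over class logits for a single pixel."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def multitask_loss(pixels, depth_weight=0.5):
    """Mean per-pixel loss: semantic cross-entropy + depth_weight * squared depth error.

    pixels: list of (class_logits, true_label, predicted_depth, true_depth).
    depth_weight is a hypothetical hyperparameter balancing the two tasks.
    """
    total = 0.0
    for logits, label, pred_depth, true_depth in pixels:
        semantic = softmax_cross_entropy(logits, label)
        depth = (pred_depth - true_depth) ** 2
        total += semantic + depth_weight * depth
    return total / len(pixels)


if __name__ == "__main__":
    fused = fuse_features([0.2, 0.7], [1.5])
    print(len(fused))  # one vector holding both RGB and depth features
    batch = [([2.0, 0.0, -1.0], 0, 1.8, 2.0), ([0.0, 1.0, 0.0], 1, 3.1, 3.0)]
    print(multitask_loss(batch))
```

Sharing one loss across both outputs is what makes the setup multi-task: errors in the depth regression and in the semantic prediction both contribute to the same training objective.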

Main Results:

Augmenting RGB data with estimated depth significantly improved object detection accuracy.
Incorporating estimated depth information also improved semantic segmentation performance.
The proposed multi-task learning approach improved RGB-D semantic segmentation results.

Conclusions:

Estimated depth from monocular images is a viable and effective substitute for measured depth in computer vision.
Integrating estimated depth features alongside RGB data offers substantial performance gains for object detection and semantic segmentation.
The developed methods provide a practical way to leverage depth information from standard images, broadening the applicability of depth-aware computer vision techniques.