



Exploiting Depth From Single Monocular Images for Object Detection and Semantic Segmentation.

Yuanzhouhan Cao, Chunhua Shen, Heng Tao Shen

    IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society | January 24, 2017
    PubMed
    Summary
    This summary is machine-generated.

    Augmenting standard images with estimated depth information significantly enhances computer vision tasks like object detection and semantic segmentation. This approach leverages deep learning for depth estimation, improving accuracy even without specialized sensors.


    Area of Science:

    • Computer Vision
    • Deep Learning
    • Machine Learning

    Background:

    • Measured depth data, often from sensors like Microsoft Kinect, improves computer vision tasks.
    • Most available images lack depth information, limiting their utility in depth-aware applications.
    • Existing methods rely on direct depth measurement, which is not universally accessible.

    Purpose of the Study:

    • To demonstrate that estimated depth can augment RGB images to improve computer vision task performance.
    • To develop and evaluate methods for integrating estimated depth features with RGB data.
    • To propose a novel RGB-D semantic segmentation approach using multi-task learning.
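One simple way to integrate estimated depth features with RGB features is late fusion by concatenation. The sketch below is only an illustration under that assumption: this summary does not specify the fusion scheme used in the paper, and the names `l2_normalize` and `fuse_features` are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length (zero vectors pass through)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def fuse_features(rgb_feat, depth_feat):
    """Concatenate normalized RGB and estimated-depth features (assumed fusion scheme)."""
    # Normalizing each modality first keeps one from dominating the other by scale.
    return l2_normalize(rgb_feat) + l2_normalize(depth_feat)

# Toy feature vectors standing in for features from an RGB network and a depth network.
fused = fuse_features([3.0, 4.0], [0.0, 2.0])
print(len(fused), fused)
```

The fused vector would then be passed to whatever detector or classifier sits on top; normalizing per modality is a design choice made here, not something stated in the source.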

    Main Methods:

    • Trained a deep model to estimate depth from monocular RGB images.
    • Extracted deep depth features from estimated depth maps.
    • Combined RGB and estimated depth features for object detection and semantic segmentation.
    • Developed an RGB-D semantic segmentation method using a multi-task learning framework (semantic prediction and depth regression).
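The multi-task idea in the last step (semantic prediction plus depth regression) can be sketched as a joint loss. This is a minimal illustration, not the paper's formulation: the per-pixel cross-entropy term, the mean-squared-error depth term, and the weighting parameter `depth_weight` are all assumptions, since the summary gives no loss definition.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability of the true class for one pixel."""
    return -math.log(softmax(logits)[label])

def multitask_loss(pixel_logits, labels, pred_depth, true_depth, depth_weight=0.5):
    """Return (total, segmentation, depth) losses averaged over pixels.

    depth_weight is a hypothetical balancing factor, not a value from the paper.
    """
    n = len(labels)
    seg = sum(cross_entropy(l, y) for l, y in zip(pixel_logits, labels)) / n
    dep = sum((p - t) ** 2 for p, t in zip(pred_depth, true_depth)) / n
    return seg + depth_weight * dep, seg, dep

# Two pixels, two classes, uniform class scores, one depth error of 2 units.
total, seg, dep = multitask_loss(
    pixel_logits=[[0.0, 0.0], [0.0, 0.0]],
    labels=[0, 1],
    pred_depth=[1.0, 2.0],
    true_depth=[1.0, 4.0],
)
print(total, seg, dep)
```

In a real network both terms would be computed from two output heads that share a feature backbone, so gradients from the depth task also shape the features used for segmentation.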

    Main Results:

    • Augmenting RGB data with estimated depth significantly improved object detection accuracy.
    • Incorporating estimated depth information also markedly improved semantic segmentation performance.
    • The proposed multi-task learning approach improved RGB-D semantic segmentation results.

    Conclusions:

    • Estimated depth from monocular images is a viable and effective substitute for measured depth in computer vision.
    • Integrating estimated depth features alongside RGB data offers substantial performance gains for object detection and semantic segmentation.
    • The developed methods provide a practical way to leverage depth information from standard images, broadening the applicability of depth-aware computer vision techniques.