Updated: Sep 11, 2025

3DCoMPaT⁺⁺: An Improved Large-Scale 3D Vision Dataset for Compositional Recognition.

Habib Slim, Xiang Li, Yuchen Li

IEEE Transactions on Pattern Analysis and Machine Intelligence

August 11, 2025

Summary

This summary is machine-generated.

This study introduces 3DCoMPaT++, a large-scale multimodal 2D/3D dataset for compositional 3D vision research. It supports a new benchmark task, Grounded CoMPaT Recognition (GCR), in which a model recognizes and grounds the material compositions of the parts of a 3D object.

Area of Science:

Computer Vision
Machine Learning
3D Data Analysis

Background:

The field of 3D vision requires comprehensive datasets for advancing multimodal and compositional learning.
Existing datasets often lack the scale, detail, or compositional complexity needed for advanced 3D understanding tasks.

Purpose of the Study:

To introduce 3DCoMPaT++, a large-scale multimodal 2D/3D dataset designed to facilitate research in compositional 3D vision.
To establish a new benchmark task, Grounded CoMPaT Recognition (GCR), for recognizing and grounding material compositions on 3D object parts.
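
As a rough illustration of what the GCR task asks for, the sketch below represents a composition as a shape category plus a set of (part, material) pairs and scores a prediction by how many ground-truth pairs it recovers. The class, function, and label names are hypothetical and are not taken from the dataset's code release, and the scoring rule is an illustrative assumption rather than the benchmark's official metric.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Composition:
    """Hypothetical container: one shape label plus (part, material) pairs."""
    shape: str
    part_materials: frozenset = field(default_factory=frozenset)


def pair_accuracy(pred: Composition, truth: Composition) -> float:
    """Fraction of ground-truth (part, material) pairs recovered by pred.

    Returns 0.0 when the shape category is wrong. This rule is an
    assumption made for illustration, not the official GCR metric.
    """
    if pred.shape != truth.shape:
        return 0.0
    if not truth.part_materials:
        return 1.0
    hits = len(pred.part_materials & truth.part_materials)
    return hits / len(truth.part_materials)


# Made-up labels: the prediction gets the seat right and the leg wrong.
truth = Composition("chair", frozenset({("seat", "leather"), ("leg", "wood")}))
pred = Composition("chair", frozenset({("seat", "leather"), ("leg", "metal")}))
print(pair_accuracy(pred, truth))  # 0.5
```

Treating the composition as a set of pairs keeps the sketch order-independent; a real evaluation would also need to check where on the object each part is grounded, which this example leaves out.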

Main Methods:

Generation of 160 million rendered views from over 10 million stylized 3D shapes with part-instance-level annotations.
Inclusion of diverse data modalities: RGB point clouds, 3D textured meshes, depth maps, and segmentation masks.
Development and evaluation of methods for the novel GCR task, including a modified PointNet++ model.
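
To put the scale figures above in perspective, the short sketch below works out the implied average number of rendered views per stylized shape. It uses only the two counts quoted in the summary; because the shape count is given as "over 10 million", the result is an upper bound. The constant names and modality labels are our own and do not come from the dataset's code.

```python
# Figures quoted in the summary. The shape count is stated as
# "over 10 million", so it is treated here as a lower bound.
TOTAL_VIEWS = 160_000_000
MIN_STYLIZED_SHAPES = 10_000_000

# Modalities listed in the summary (the string labels are our own).
MODALITIES = (
    "rgb_point_cloud",
    "textured_mesh",
    "depth_map",
    "segmentation_mask",
)


def max_views_per_shape(total_views: int, min_shapes: int) -> float:
    """Upper bound on the average number of rendered views per shape."""
    return total_views / min_shapes


print(max_views_per_shape(TOTAL_VIEWS, MIN_STYLIZED_SHAPES))  # 16.0
```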

Main Results:

The 3DCoMPaT++ dataset encompasses 42 shape categories, 275 part categories, and 293 material classes.
The GCR task was explored through a data challenge at CVPR, highlighting effective approaches.
Public release of the dataset and code to support future research.

Conclusions:

3DCoMPaT++ provides a valuable resource for advancing multimodal and compositional learning in 3D vision.
The GCR task and dataset are expected to spur innovation in understanding complex 3D object properties and their compositions.
The work aims to lower the barrier for future research in compositional 3D vision.