Understanding Pixel-Level 2D Image Semantics With 3D Keypoint Knowledge Engine.

Yang You, Chengkun Li, Yujing Lou

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |April 13, 2021
    Summary
    This summary is machine-generated.

    This study introduces a novel computer vision method for pixel-level 2D object understanding by leveraging 3D semantic information. The approach enhances semantic understanding by predicting 3D semantics and projecting them back to 2D images.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • 3D Object Recognition

    Background:

    • Current 2D object understanding methods lose crucial 3D spatial information.
    • End-to-end training on 2D images limits deep object comprehension (e.g., functionality, affordance).

    Purpose of the Study:

    • To develop a novel method for pixel-level 2D object semantic understanding.
    • To bridge the gap between 2D image analysis and 3D spatial information for improved machine perception.

    Main Methods:

    • Proposed a new method that predicts image semantics in the 3D domain and then projects them back onto the 2D image.
    • Introduced KeypointNet, a large-scale keypoint knowledge engine with 103,450 keypoints and 8,234 3D models.
    • Leveraged 3D vision advantages for explicit reasoning about object self-occlusion and visibility.
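
The idea of projecting 3D keypoints back to 2D while reasoning about self-occlusion can be illustrated with a small sketch. This is not the authors' implementation: the pinhole camera model, the coarse per-pixel depth buffer, and every name here (`project_point`, `visible_keypoints`, `focal`, `cx`, `cy`, `depth_tol`) are assumptions chosen for illustration only.

```python
from math import inf
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_point(p: Point3D, focal: float, cx: float, cy: float) -> Pixel:
    """Project a camera-space 3D point to pixel coordinates (pinhole model, assumed)."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal * x / z + cx, focal * y / z + cy)


def visible_keypoints(
    keypoints: List[Point3D],
    surface_points: List[Point3D],
    focal: float,
    cx: float,
    cy: float,
    depth_tol: float = 1e-2,
) -> List[Tuple[Pixel, bool]]:
    """Return (pixel, is_visible) for each keypoint.

    A keypoint counts as occluded when a sampled surface point of the same
    object lands on the same integer pixel at a noticeably smaller depth.
    """
    # Coarse depth buffer: nearest surface depth seen at each integer pixel.
    zbuf: Dict[Tuple[int, int], float] = {}
    for p in surface_points:
        u, v = project_point(p, focal, cx, cy)
        key = (round(u), round(v))
        zbuf[key] = min(zbuf.get(key, inf), p[2])

    results: List[Tuple[Pixel, bool]] = []
    for kp in keypoints:
        u, v = project_point(kp, focal, cx, cy)
        nearest = zbuf.get((round(u), round(v)), inf)
        results.append(((u, v), kp[2] <= nearest + depth_tol))
    return results
```

In this sketch, a keypoint on the far side of an object projects to the same pixel as a nearer surface sample and is therefore marked as not visible, which is the kind of explicit visibility reasoning that a purely 2D pipeline cannot perform directly.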

    Main Results:

    • Achieved comparable or superior results on standard semantic benchmarks.
    • Demonstrated effective pixel-level semantic understanding by integrating 3D data.
    • Successfully addressed limitations of purely 2D-based approaches.

    Conclusions:

    • The proposed method enhances 2D object semantic understanding by utilizing 3D information.
    • KeypointNet provides a valuable resource for 3D semantic label generation.
    • This approach offers a more comprehensive understanding of objects in computer vision systems.