Related Concept Videos

Vision

Vision is the result of light being detected and transduced into neural signals by the retina of the eye. This information is then further analyzed and interpreted by the brain. First, light enters the front of the eye and is focused by the cornea and lens onto the retina—a thin sheet of neural tissue lining the back of the eye. Because of refraction through the convex lens of the eye, images are projected onto the retina upside-down and reversed.

Depth Perception and Spatial Vision

Depth perception is the ability to perceive objects three-dimensionally. It relies on two types of cues: binocular and monocular. Binocular cues depend on the combination of images from both eyes and how the eyes work together. Since the eyes are in slightly different positions, each eye captures a slightly different image. This disparity between images, known as binocular disparity, helps the brain interpret depth. When the brain compares these images, it determines the distance to an object.

Related Experiment Video

Updated: May 22, 2025

Combining Eye-tracking Data with an Analysis of Video Content from Free-viewing a Video of a Walk in an Urban Park Environment

Published on: May 7, 2019

Towards zero-shot human-object interaction detection via vision-language integration.

Weiying Xue¹, Qi Liu¹, Yuxiao Wang¹

¹School of Future Technology, South China University of Technology, Guangdong Guangzhou, 511400, PR China.

Neural Networks : the Official Journal of the International Neural Network Society

March 14, 2025

Summary

This summary is machine-generated.

This study introduces a new framework for human-object interaction (HOI) detection, improving zero-shot capabilities by integrating visual-language models (VLMs). The KI2HOI model enhances HOI detection accuracy, especially for unseen object categories.

Keywords:

Human–object interaction; Multimodal integration; Weakly supervision; Zero-shot

More Related Videos

Exploring Infant Sensitivity to Visual Language using Eye Tracking and the Preferential Looking Paradigm

Published on: May 15, 2019

Author Spotlight: Insights into the Analysis of Human Interaction with 3D Virtual Objects

Published on: October 18, 2024

Area of Science:

Computer Vision
Artificial Intelligence
Machine Learning

Background:

Human-object interaction (HOI) detection is crucial for understanding complex scenes in images.
Current supervised methods struggle with generalization to unseen object categories due to reliance on extensive annotations.
Visual-language models (VLMs) show promise for zero-shot learning, offering a potential solution for HOI detection limitations.

Purpose of the Study:

To develop a novel framework, Knowledge Integration to HOI (KI2HOI), for improved zero-shot HOI detection.
To leverage the zero-shot capabilities of VLMs to enhance HOI detection performance.
To address the generalization limitations of current HOI detection methods.

Main Methods:

Proposed a novel framework, KI2HOI, integrating VLM knowledge for zero-shot HOI detection.
Introduced a ho-pair encoder to provide contextual and interaction-specific semantic representations.
Implemented two fusion strategies: visual-level fusion for global context and language-level fusion to enhance VLM capabilities.
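
The general idea behind zero-shot classification with a VLM can be illustrated with a minimal sketch: embed a human-object pair and a set of text prompts in a shared space, then rank the prompts by cosine similarity. This is not the KI2HOI architecture and does not reproduce its ho-pair encoder or fusion strategies. The function names, the temperature value, and the three-dimensional toy embeddings are all assumptions chosen for illustration; a real system would obtain the embeddings from a pretrained VLM.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores, temperature=0.1):
    """Turn similarity scores into a probability distribution.

    The temperature of 0.1 is an arbitrary illustrative choice.
    """
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_scores(pair_embedding, text_embeddings):
    """Score each candidate interaction prompt against one pair embedding."""
    labels = list(text_embeddings)
    sims = [cosine(pair_embedding, text_embeddings[label]) for label in labels]
    return dict(zip(labels, softmax(sims)))


# Hypothetical toy embeddings standing in for VLM text-encoder outputs.
TEXT_EMBEDDINGS = {
    "a person riding a bicycle": [0.9, 0.1, 0.0],
    "a person holding a cup": [0.1, 0.9, 0.1],
    "a person riding a horse": [0.7, 0.0, 0.7],
}

# Hypothetical embedding of one detected human-object pair.
PAIR_EMBEDDING = [0.8, 0.05, 0.6]

scores = zero_shot_scores(PAIR_EMBEDDING, TEXT_EMBEDDINGS)
best_label = max(scores, key=scores.get)
print(best_label)
```

Because the candidate labels are plain text prompts, a new interaction category can be added at inference time by adding a prompt, without annotated training examples for it. That property is what makes VLM embeddings attractive for unseen categories.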

Main Results:

KI2HOI demonstrated superior performance compared to existing methods on HICO-DET and V-COCO datasets.
The model achieved significant improvements in both zero-shot and fully supervised HOI detection settings.
The proposed fusion strategies effectively facilitated prior knowledge transfer from VLMs.

Conclusions:

The KI2HOI framework effectively integrates VLM knowledge to advance zero-shot HOI detection.
The method shows strong generalization capabilities, outperforming previous approaches.
This work offers a promising direction for more robust and adaptable HOI detection systems.