A Deep Sequence Learning Framework for Action Recognition in Small-Scale Depth Video Dataset.

Mohammad Farhad Bulbul¹,², Amin Ullah³, Hazrat Ali⁴

¹Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam, Pohang 37673, Korea.

Sensors (Basel, Switzerland) | September 23, 2022

Summary

This summary is machine-generated.

This study introduces a novel deep learning model for human action recognition using depth video sequences. The method effectively classifies actions even with limited training data, outperforming existing approaches.

Keywords:

3D action recognition; CNN; RNN; attention; bi-directional LSTM; depth map sequence; transfer learning

Area of Science:

Computer Vision
Machine Learning
Deep Learning

Background:

Deep learning models for human action recognition typically rely on RGB or skeleton data, with depth video models being less common.
Training deep models with limited data is a significant challenge in depth-based action recognition research.
Existing methods often struggle with the scarcity of depth video datasets.

Purpose of the Study:

To propose a novel deep learning model for direct depth video sequence classification.
To address the challenge of limited training data in depth-based human action recognition.
To improve the efficacy and performance of action recognition using depth data.

Main Methods:

A four-stream representation is created by transforming depth videos into multi-view temporal motion frames.
DenseNet121 with ImageNet pre-trained weights extracts frame-level features from depth and motion streams.
Features are processed through bi-directional Long Short-Term Memory (BLSTM) networks and multi-head self-attention (MHSA) for temporal analysis and correlation capture.
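
The first step above, turning a depth video into multi-view temporal motion frames, can be illustrated with a minimal NumPy sketch. This summary does not state exactly which four streams the authors use or how the views are computed, so everything here is an assumption for illustration: the front/side/top projection scheme, the frame-differencing rule, the `depth_bins` parameter, and the function names `project_views`, `motion_frames`, and `four_streams` are all hypothetical.

```python
import numpy as np


def project_views(depth, depth_bins=8):
    """Project one depth frame (H x W, 0 = background) onto three planes.

    Assumed scheme (not from the paper): the front view is the depth frame
    itself; the side and top views are binary occupancy maps over
    quantized depth values.
    """
    h, w = depth.shape
    max_d = depth.max()
    if max_d > 0:
        # Quantize depth into bin indices 0 .. depth_bins - 1.
        bins = np.minimum((depth / max_d * depth_bins).astype(int), depth_bins - 1)
    else:
        bins = np.zeros_like(depth, dtype=int)
    side = np.zeros((h, depth_bins))  # rows x depth
    top = np.zeros((depth_bins, w))   # depth x columns
    ys, xs = np.nonzero(depth)
    side[ys, bins[ys, xs]] = 1.0
    top[bins[ys, xs], xs] = 1.0
    return depth.astype(float), side, top


def motion_frames(views):
    """Absolute difference between consecutive frames: (T, H, W) -> (T-1, H, W)."""
    views = np.asarray(views)
    return np.abs(views[1:] - views[:-1])


def four_streams(video, depth_bins=8):
    """Build a hypothetical four-stream representation from a (T, H, W) depth video."""
    front, side, top = zip(*(project_views(f, depth_bins) for f in video))
    return {
        "depth": np.asarray(front),
        "front_motion": motion_frames(front),
        "side_motion": motion_frames(side),
        "top_motion": motion_frames(top),
    }


# Usage on a random stand-in for a depth clip (5 frames of 6 x 4 pixels).
video = np.random.default_rng(0).integers(0, 255, size=(5, 6, 4))
streams = four_streams(video)
```

In a pipeline like the one described, each stream's frames would then be passed to a pre-trained image backbone for frame-level feature extraction; that stage is omitted here because it depends on a deep learning framework.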

Main Results:

The proposed framework demonstrates efficacy on small-scale depth datasets (MSRAction3D, DHA).
The model achieves superior performance compared to existing depth data-based action recognition methods.
Effective action classification is achieved even with insufficient training samples.

Conclusions:

The developed deep model offers a viable solution for human action recognition using depth video sequences, especially in data-limited scenarios.
The four-stream representation and attention mechanisms enhance the model's ability to capture complex temporal dynamics.
This work advances the field of depth-based action recognition by providing a robust and efficient framework.
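
To make the attention mechanism mentioned above concrete, here is a minimal NumPy sketch of standard multi-head self-attention applied to a sequence of per-frame feature vectors. It is not the authors' implementation: the random projection matrices stand in for learned weights, and the function name `multi_head_self_attention`, the feature size, and the head count are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(x, num_heads, rng):
    """Scaled dot-product self-attention over a (T, D) feature sequence.

    Returns the attended sequence (T, D) and the attention weights
    (num_heads, T, T). Projection weights are random placeholders for
    parameters that would normally be learned.
    """
    t, d = x.shape
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    d_head = d // num_heads
    w_q, w_k, w_v, w_o = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    def split(m):
        # (T, D) -> (num_heads, T, d_head)
        return m.reshape(t, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Each time step scores every other time step, per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    heads = weights @ v  # (num_heads, T, d_head)
    merged = heads.transpose(1, 0, 2).reshape(t, d)
    return merged @ w_o, weights


# Usage: 6 time steps of 16-dimensional features, 4 heads.
rng = np.random.default_rng(0)
features = rng.standard_normal((6, 16))
attended, attn = multi_head_self_attention(features, num_heads=4, rng=rng)
```

Each row of `attn[h]` is a probability distribution over time steps, which is how attention lets every frame draw on correlated frames elsewhere in the sequence.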