DPSNet: Multitask Learning Using Geometry Reasoning for Scene Depth and Semantics.

Junning Zhang, Qunxing Su, Bo Tang

IEEE Transactions on Neural Networks and Learning Systems

September 16, 2021

Summary

This summary is machine-generated.

This study introduces DPSNet, a multitask learning method that jointly performs depth estimation, camera pose estimation, and semantic scene segmentation from monocular images. By explicitly modeling geometric structure, DPSNet reports state-of-the-art results on all three tasks.

Area of Science:

Computer Vision
Machine Learning
Deep Learning

Background:

Monocular depth estimation and semantic understanding are challenging computer vision tasks.
Existing joint learning frameworks often fail to model geometric structures due to limitations in learning camera motion.
Accurate scene understanding requires integrating depth, camera pose, and semantic information.

Purpose of the Study:

To propose DPSNet, a multitask learning method for joint depth estimation, camera pose estimation, and semantic scene segmentation from monocular images.
To address limitations in existing methods by incorporating geometric reasoning and camera motion learning.
To improve the accuracy and robustness of scene understanding in computer vision.

Main Methods:

Developed DPSNet, a novel multitask learning architecture.
Introduced a rigid semantic consistency loss for robust depth and camera pose prediction, overcoming limitations of pixel reconstruction losses, such as their handling of moving pixels.
Utilized multiscale geometric reasoning for accurate semantic scene segmentation.
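
The idea of checking semantic labels across views under a rigid camera motion can be illustrated with a minimal sketch. This is not the loss defined in the paper, whose exact formulation is not given in this summary. It only assumes a pinhole camera with intrinsics `K`, a relative pose `(R, t)`, a per-pixel depth map, and integer label maps for two views; the function names `reproject` and `semantic_mismatch` are hypothetical.

```python
import numpy as np

def reproject(depth, K, R, t):
    """Map every target-view pixel into source-view pixel coordinates.

    Assumes a pinhole camera (intrinsics K) and a rigid motion (R, t)
    from the target camera frame to the source camera frame.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    rays = np.linalg.inv(K) @ pix            # back-project pixels to rays
    pts = rays * depth.reshape(1, -1)        # 3-D points in the target frame
    pts_src = R @ pts + t.reshape(3, 1)      # rigid transform to source frame
    proj = K @ pts_src
    in_front = proj[2] > 1e-6                # keep points in front of the camera
    uv = proj[:2] / np.where(in_front, proj[2], 1.0)
    return uv.reshape(2, h, w), in_front.reshape(h, w)

def semantic_mismatch(sem_tgt, sem_src, depth, K, R, t):
    """Fraction of validly reprojected pixels whose labels disagree.

    Illustrative stand-in for a semantic consistency term: if depth and
    pose are right and the scene is rigid, labels should agree.
    """
    h, w = depth.shape
    uv, in_front = reproject(depth, K, R, t)
    u = np.rint(uv[0]).astype(int)
    v = np.rint(uv[1]).astype(int)
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    if not valid.any():
        return 0.0
    warped = sem_src[v[valid], u[valid]]     # nearest-neighbour label lookup
    return float(np.mean(warped != sem_tgt[valid]))
```

With an identity pose and identical label maps the score is 0.0. Pixels on independently moving objects violate the rigid-scene assumption and would raise it, which is why such pixels need special handling. A trainable loss would also need a differentiable formulation in place of the nearest-neighbour lookup used here.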

Main Results:

DPSNet demonstrated state-of-the-art performance across all three tasks: depth estimation, camera pose estimation, and semantic scene segmentation.
Experiments on open-source and custom datasets validated the effectiveness of each component of DPSNet.
The proposed rigid semantic consistency loss proved effective in handling moving pixels and improving geometric modeling.

Conclusions:

DPSNet offers a significant advancement in multitask learning for monocular vision tasks.
The integration of geometric reasoning and semantic consistency enhances scene understanding capabilities.
The model's state-of-the-art performance highlights its potential for real-world applications in autonomous driving and robotics.