Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation.

Yunzhi Zhuge, Hongyu Gu, Lu Zhang

IEEE Transactions on Neural Networks and Learning Systems

July 8, 2024

Summary

This summary is machine-generated.

MTNet efficiently segments objects in videos by combining motion, appearance, and temporal cues. This unsupervised video object segmentation (UVOS) method achieves state-of-the-art results and is versatile for related tasks.

Area of Science:

Computer Vision
Artificial Intelligence

Background:

Unsupervised Video Object Segmentation (UVOS) faces challenges in accurately segmenting objects without labeled data.
Existing methods often focus on either appearance-motion integration or temporal modeling, so they do not exploit all three cues together.

Purpose of the Study:

To propose an efficient algorithm, MTNet, for unsupervised video object segmentation.
To concurrently exploit motion, appearance, and temporal cues within a unified framework.

Main Methods:

MTNet merges appearance and motion features in encoders for complementary representations.
A temporal transformer module captures long-range contextual dynamics and inter-frame interactions.
A cascade of decoders at all feature levels refines segmentation masks.
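
The first two steps above can be illustrated with a toy sketch. This is not the MTNet implementation: the summary does not specify how features are fused or how the temporal transformer is built. The sketch assumes element-wise addition for fusion and a single-head scaled dot-product self-attention with no learned projections. The names `fuse` and `temporal_attention` and the feature values are hypothetical.

```python
import math

def fuse(appearance, motion):
    """Fuse one frame's appearance and motion features (assumption: element-wise sum)."""
    return [a + m for a, m in zip(appearance, motion)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention(frames):
    """Self-attention across frames: each frame becomes a weighted mix of all frames.

    Assumption: queries, keys, and values are the raw features (no learned weights).
    """
    dim = len(frames[0])
    out = []
    for query in frames:
        # Similarity of this frame to every frame, scaled by sqrt(feature size).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in frames]
        weights = softmax(scores)
        # Weighted average of all frame features, one value per feature dimension.
        out.append([sum(w * f[i] for w, f in zip(weights, frames))
                    for i in range(dim)])
    return out

# Hypothetical two-dimensional features for three frames.
appearance = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
motion = [[0.5, 0.5], [0.5, -0.5], [0.0, 0.0]]

fused = [fuse(a, m) for a, m in zip(appearance, motion)]
context = temporal_attention(fused)
```

Because the attention weights for each frame are non-negative and sum to one, every output vector is a weighted average of the fused frame features, which is how information from distant frames reaches the current one. In a real model the resulting features would then go to the decoders for mask refinement.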

Main Results:

MTNet achieves state-of-the-art performance in unsupervised video object segmentation (UVOS).
The method demonstrates competitive results in video salient object detection (VSOD).
MTNet localizes and tracks objects accurately in challenging scenarios.

Conclusions:

MTNet offers a robust and efficient framework for UVOS by integrating diverse cues.
The method's versatility extends to other video segmentation tasks.
The proposed approach effectively leverages temporal and cross-modality knowledge.