From Pixels to Semantics: Self-Supervised Video Object Segmentation With Multiperspective Feature Mining.

Ruoqi Li, Yifan Wang, Lijun Wang

IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society

September 2, 2022

Summary

This summary is machine-generated.

This study introduces a novel self-supervised framework for one-shot video object segmentation (O-VOS). It combines pixel-level and semantic-level adaptation for improved mask propagation and achieves state-of-the-art results.

Area of Science:

Computer Vision
Machine Learning
Artificial Intelligence

Background:

Current self-supervised methods for one-shot video object segmentation (O-VOS) frame the task as pixel-level matching.
This approach is limited because O-VOS requires semantic correspondence more than precise pixel matching.
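
To make "pixel-level matching" concrete, here is a minimal sketch of how a first-frame mask can be carried to a later frame through a softmax affinity over per-pixel features. The function names (`softmax`, `affinity`, `propagate_mask`), the dot-product similarity, the temperature value, and the toy features are all illustrative assumptions; none of them are taken from the paper.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def affinity(query_feats, ref_feats, temperature=0.1):
    """Row i holds the matching weights of query pixel i over all reference pixels."""
    rows = []
    for q in query_feats:
        # Dot-product similarity is an assumption made for this sketch.
        scores = [sum(a * b for a, b in zip(q, r)) for r in ref_feats]
        rows.append(softmax(scores, temperature))
    return rows

def propagate_mask(query_feats, ref_feats, ref_mask, temperature=0.1):
    """Soft foreground probability per query pixel, copied from matched reference pixels."""
    weights = affinity(query_feats, ref_feats, temperature)
    return [sum(w * m for w, m in zip(row, ref_mask)) for row in weights]

# Toy data (hypothetical): 2-D features for three reference and two query pixels.
ref_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
ref_mask = [1.0, 1.0, 0.0]  # the first two reference pixels are foreground
query_feats = [[1.0, 0.0], [0.0, 1.0]]

query_mask = propagate_mask(query_feats, ref_feats, ref_mask)
```

In this toy case the first query pixel resembles the foreground reference pixels and receives a probability close to 1, while the second receives a probability close to 0. The propagation depends only on how similar individual pixel features are, which is the limitation the background points to.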

Purpose of the Study:

To develop a novel self-supervised framework that integrates pixel-level correspondence learning with semantic-level adaptation for improved O-VOS.
To enhance feature reliability and suppress noise for more robust image matching in video segmentation.

Main Methods:

Trained the network offline to learn pixel-level correspondence through photometric reconstruction of adjacent RGB frames.
Incorporated semantic-level adaptation at test-time by enforcing bi-directional agreement of predicted segmentation masks.
Proposed a new network architecture featuring a multi-perspective feature mining mechanism to enhance reliable features and reduce noisy ones.
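
The first two steps above can be sketched as follows. The photometric loss reconstructs each pixel of one frame as an affinity-weighted mix of colors from the adjacent frame and penalizes the difference. The agreement score propagates a mask forward and then back, and checks that it returns to where it started. All names, the L1 penalty, the round-trip agreement measure, and the toy values are assumptions for illustration; the paper's actual losses and network are not reproduced here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def affinity(src_feats, dst_feats, temperature=0.1):
    """Row i: matching weights of pixel i in src over all pixels in dst."""
    return [
        softmax([sum(a * b for a, b in zip(s, d)) / temperature for d in dst_feats])
        for s in src_feats
    ]

def transfer(weights, values):
    """Affinity-weighted average of per-pixel values (colors or mask labels)."""
    channels = len(values[0])
    return [
        [sum(w * v[c] for w, v in zip(row, values)) for c in range(channels)]
        for row in weights
    ]

def photometric_loss(target_feats, source_feats, target_rgb, source_rgb):
    """Mean L1 error between the target frame and its reconstruction from the source."""
    recon = transfer(affinity(target_feats, source_feats), source_rgb)
    errors = [abs(r - t) for rp, tp in zip(recon, target_rgb) for r, t in zip(rp, tp)]
    return sum(errors) / len(errors)

def bidirectional_disagreement(feats_a, feats_b, mask_a):
    """Propagate a mask A -> B -> A; return the mean change (0 means full agreement)."""
    mask_b = transfer(affinity(feats_b, feats_a), [[m] for m in mask_a])
    mask_back = transfer(affinity(feats_a, feats_b), mask_b)
    return sum(abs(back[0] - m) for back, m in zip(mask_back, mask_a)) / len(mask_a)

# Toy frames (hypothetical): the same two pixels, stored in swapped order.
feats_a = [[1.0, 0.0], [0.0, 1.0]]
feats_b = [[0.0, 1.0], [1.0, 0.0]]
rgb_a = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
rgb_b = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
mask_a = [1.0, 0.0]

loss = photometric_loss(feats_a, feats_b, rgb_a, rgb_b)
disagreement = bidirectional_disagreement(feats_a, feats_b, mask_a)
```

With features that match the right pixels, both the reconstruction error and the round-trip disagreement are close to zero. At test time, a disagreement score of this kind could act as a label-free signal for adapting the model, in the spirit of the semantic-level adaptation described above. How the paper formulates that step is not stated in this summary.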

Main Results:

Achieved state-of-the-art performance on widely adopted datasets for one-shot video object segmentation.
Demonstrated the effectiveness of the integrated pixel-level and semantic-level adaptation approach.
Showcased the benefits of the multi-perspective feature mining mechanism for robust image matching.

Conclusions:

The proposed self-supervised framework narrows the gap between self-supervised and fully supervised methods in O-VOS.
The integration of semantic-level adaptation significantly improves segmentation mask propagation.
The novel network architecture contributes to more robust and accurate video object segmentation.