Toward High Quality Multi-Object Tracking and Segmentation Without Mask Supervision.

Wensheng Cheng, Yi Wu, Zhenyu Wu

IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society

May 27, 2024

Summary

This summary is machine-generated.

This study introduces BoxMOTS, a novel framework for weakly supervised multi-object tracking and segmentation that uses only bounding box annotations, overcoming limitations of previous methods by fully exploiting temporal information for improved accuracy.

Area of Science:

Computer Vision
Machine Learning
Artificial Intelligence

Background:

Weakly supervised multi-object tracking and segmentation methods often suffer from coarse pseudo mask labels and underutilization of temporal information.
Existing approaches struggle with accurate segmentation and robust tracking due to these inherent limitations.

Purpose of the Study:

To develop a novel framework, BoxMOTS, that addresses the limitations of current weakly supervised multi-object tracking and segmentation methods.
To eliminate the need for pseudo mask labels by directly utilizing bounding box annotations for segmentation supervision.
To enhance the utilization of temporal information for improved mask quality and data association in tracking.

Main Methods:

A framework that directly uses bounding box labels to supervise the segmentation network, avoiding pseudo mask labels.
Integration of optical flow-based pairwise consistency to ensure mask consistency across frames, enhancing segmentation quality.
A temporally adjacent pair-based sampling strategy for instance embedding learning, optimizing data association in tracking.
An end-to-end deep model, BoxMOTS, combining these techniques for unified tracking and segmentation.
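
Two of the ideas above can be illustrated with a minimal sketch. This summary does not give the exact loss or sampling rule used in BoxMOTS, so everything here is an assumption: `projection_loss` shows one common form of box-only mask supervision (a projection loss, in which the mask's shadow on each axis should match the box's extent), and `sample_adjacent_pairs` shows one plausible reading of temporally adjacent pair sampling for embedding learning. The function names, the inclusive box convention, and the `max_gap` parameter are hypothetical.

```python
def projection_loss(mask, box):
    """Dice-style loss between axis projections of a soft mask and a box.

    mask: list of rows, each a list of foreground probabilities in [0, 1].
    box:  (x0, y0, x1, y1) in inclusive pixel coordinates (assumed convention).
    """
    height, width = len(mask), len(mask[0])
    x0, y0, x1, y1 = box
    # Project the predicted mask onto each axis with a max over the other axis.
    pred_x = [max(mask[y][x] for y in range(height)) for x in range(width)]
    pred_y = [max(row) for row in mask]
    # A box projects to 1 inside its extent and 0 outside.
    box_x = [1.0 if x0 <= x <= x1 else 0.0 for x in range(width)]
    box_y = [1.0 if y0 <= y <= y1 else 0.0 for y in range(height)]

    def dice(pred, target, eps=1e-6):
        inter = sum(p * t for p, t in zip(pred, target))
        denom = sum(p * p for p in pred) + sum(t * t for t in target)
        return 1.0 - (2.0 * inter + eps) / (denom + eps)

    return dice(pred_x, box_x) + dice(pred_y, box_y)


def sample_adjacent_pairs(detections, max_gap=1):
    """Pair detections that are at most `max_gap` frames apart.

    detections: list of (frame_index, track_id) tuples taken from box labels.
    Returns (positives, negatives) as lists of index pairs (i, j), where
    detection j lies 1..max_gap frames after detection i. A pair is positive
    when both detections share a track id and negative otherwise.
    """
    positives, negatives = [], []
    for i, (frame_i, id_i) in enumerate(detections):
        for j, (frame_j, id_j) in enumerate(detections):
            if 1 <= frame_j - frame_i <= max_gap:
                (positives if id_i == id_j else negatives).append((i, j))
    return positives, negatives


# A 4x4 mask that exactly fills the box (1, 1, 2, 2) has near-zero loss.
tight_mask = [[1.0 if 1 <= x <= 2 and 1 <= y <= 2 else 0.0 for x in range(4)]
              for y in range(4)]
tight_loss = projection_loss(tight_mask, (1, 1, 2, 2))

# Two tracks over three frames; only frame-adjacent pairs are sampled.
dets = [(0, "a"), (0, "b"), (1, "a"), (1, "b"), (2, "a")]
pos_pairs, neg_pairs = sample_adjacent_pairs(dets)
```

Restricting pairs to neighboring frames keeps appearance changes between the two views small, which is one reason such a strategy might help data association. The optical flow-based consistency term is not sketched here because this summary does not specify its form.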

Main Results:

BoxMOTS is reported to achieve state-of-the-art performance for weakly supervised multi-object tracking and segmentation, outperforming existing methods on benchmark datasets.
The model demonstrates promising results on the KITTI MOTS and BDD100K MOTS datasets, validating its effectiveness.
The proposed approach successfully utilizes only box annotations, eliminating the requirement for mask supervision.

Conclusions:

BoxMOTS offers a more efficient and effective approach to weakly supervised multi-object tracking and segmentation.
The framework successfully overcomes the drawbacks of coarse pseudo mask labels and limited temporal information utilization.
The model's ability to perform accurate tracking and segmentation using only box annotations represents a significant advancement in the field.