Updated: Jun 26, 2026

Perception Assisted Transformer for Unsupervised Object Re-Identification.

Shuoyi Chen, Mang Ye, Xingping Dong

    IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society | March 27, 2025
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces PAT, a Transformer-based framework for unsupervised object re-identification (Re-ID) that enhances feature learning with a target-aware mask alignment (TMA) strategy. Without identity annotations, the method is reported to match or outperform many supervised approaches.

    Area of Science:

    • Computer Vision
    • Machine Learning
    • Artificial Intelligence

    Background:

    • Unsupervised object re-identification (Re-ID) traditionally uses Convolutional Neural Networks (CNNs) for feature extraction and pseudo-labeling.
    • CNNs have limitations in capturing long-range dependencies and integrating global information, hindering performance in complex scenarios.
    • Vision Transformers (ViTs) offer superior robustness and modeling capabilities for diverse data structures, showing promise for Re-ID tasks.
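
The pseudo-labeling step mentioned above can be illustrated with a small sketch. The summary does not say which clustering algorithm the paper or earlier CNN-based pipelines use, so this example assumes a simple stand-in: samples whose cosine similarity exceeds a threshold are linked, and each connected group receives one pseudo-label. The function names, the threshold value, and the toy feature vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pseudo_labels(features, threshold=0.9):
    """Assign a cluster id to each feature vector.

    Assumption (not from the source): two samples are linked when their
    cosine similarity is >= threshold, and every connected group of
    linked samples shares one pseudo-label.
    """
    n = len(features)
    parent = list(range(n))  # union-find forest over sample indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(features[i], features[j]) >= threshold:
                parent[find(j)] = find(i)

    # Relabel roots as consecutive integers in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy features: two samples near the x-axis, two near the y-axis.
features = [
    [1.0, 0.0],
    [0.98, 0.05],
    [0.0, 1.0],
    [0.05, 0.99],
]
labels = pseudo_labels(features)
print(labels)  # [0, 0, 1, 1]
```

In a real pipeline the features would come from the backbone network, and the resulting labels would supervise the next training round in place of identity annotations.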

    Purpose of the Study:

    • To explore the potential of Vision Transformers in unsupervised object re-identification (Re-ID).
    • To propose a novel Transformer-based framework (PAT) that enhances feature learning beyond category-level supervision.
    • To improve fine-grained feature alignment and instance-level discriminative learning in unsupervised Re-ID.

    Main Methods:

    • Proposed the Perception Assisted Transformer (PAT), a Transformer-based framework for unsupervised Re-ID.
    • Introduced a target-aware mask alignment (TMA) strategy to leverage low-level visual cues and guide fine-grained feature alignment using pseudo-labels.
    • Developed a perceptual fusion feature augmentation (PFA) method to optimize instance-level discriminative learning.
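
The summary does not describe how TMA applies its masks to Transformer features, so the following is only a generic illustration of one way a target mask could steer feature aggregation: mask-weighted average pooling over patch tokens. It is not the authors' method. The function name, the token values, and the mask are hypothetical.

```python
def masked_average_pool(tokens, mask):
    """Average patch-token features, weighting each patch by a mask value in [0, 1].

    Assumption (not from the source): the mask marks how strongly each
    patch belongs to the target, with 1.0 = target and 0.0 = background.
    """
    if len(tokens) != len(mask):
        raise ValueError("tokens and mask must have the same length")
    total = sum(mask)
    if total == 0:
        # No target evidence at all: fall back to a plain average.
        mask = [1.0] * len(tokens)
        total = float(len(tokens))
    dim = len(tokens[0])
    return [
        sum(w * tok[d] for w, tok in zip(mask, tokens)) / total
        for d in range(dim)
    ]


# Four hypothetical patch tokens (2-D features); the last two are background.
tokens = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0], [-5.0, 8.0]]
mask = [1.0, 1.0, 0.0, 0.0]
pooled = masked_average_pool(tokens, mask)
print(pooled)  # [2.0, 1.0]
```

With the mask applied, the two background tokens contribute nothing to the pooled feature. This is the general sense in which a target mask can keep background pixels from dominating an image-level representation.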

    Main Results:

    • The PAT framework demonstrated superior performance and robustness on multiple Re-ID datasets compared to state-of-the-art methods.
    • The proposed TMA strategy effectively incorporates local pixel information for improved discriminative feature learning.
    • The method achieved results comparable to or better than many supervised Re-ID approaches, despite being unsupervised.

    Conclusions:

    • Vision Transformers are highly effective for unsupervised object re-identification, particularly when combined with strategies that enhance fine-grained feature learning.
    • The proposed PAT framework, incorporating TMA and PFA, addresses unsupervised Re-ID by combining instance-level discriminative learning with fine-grained feature alignment.
    • The method's ability to achieve strong performance without identity annotations highlights its potential for practical applications.