

Updated: Aug 20, 2025


TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

Yuan Yao, Fang Wan, Wei Gao

    IEEE Transactions on Neural Networks and Learning Systems
    |November 23, 2022
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces the Vision Transformer for weakly supervised object localization (WSOL), overcoming CNN limitations in capturing full object extents. The proposed Token Semantic Coupled Attention Map (TS-CAM) method significantly improves object localization accuracy and multicategory performance.


    Area of Science:

    • Computer Vision
    • Machine Learning
    • Artificial Intelligence

    Background:

    • Weakly supervised object localization (WSOL) using only image category labels is challenging.
    • Convolutional Neural Networks (CNNs) often fail to localize the full object extent, focusing instead on discriminative parts due to difficulties in capturing long-range semantic dependencies.

    Purpose of the Study:

    • To address the limitations of CNNs in WSOL by leveraging the Vision Transformer (ViT).
    • To propose a novel method, Token Semantic Coupled Attention Map (TS-CAM), for improved object localization.

    Main Methods:

    • Introduced the Vision Transformer to WSOL to capture long-range semantic dependencies via self-attention.
    • Developed TS-CAM, which decomposes class-aware semantics and couples them with attention maps for semantic-aware activation.
    • Employed spatial embedding by partitioning images into patch tokens and reallocating category semantics to these tokens for improved long-distance feature capture.
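The coupling step described above can be illustrated with a small sketch. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes the class token sits at index 0 of each attention matrix, that attention has already been averaged over heads, and that per-class semantic scores for each patch token are given. The function names (`class_token_attention`, `couple`, `to_grid`) and all numbers are hypothetical.

```python
def class_token_attention(attn_layers):
    """Average, over layers, the class token's attention to each patch token.

    attn_layers: list of (N+1) x (N+1) head-averaged attention matrices.
    Assumption: row 0 and column 0 belong to the class token.
    """
    num_layers = len(attn_layers)
    num_patches = len(attn_layers[0]) - 1
    agg = [0.0] * num_patches
    for attn in attn_layers:
        # Row 0, columns 1..N: class token attending to the patch tokens.
        for i, a in enumerate(attn[0][1:]):
            agg[i] += a / num_layers
    return agg


def couple(attention, semantic_maps):
    """Multiply the class-agnostic attention with each class's patch scores."""
    return [[a * s for a, s in zip(attention, sem)] for sem in semantic_maps]


def to_grid(flat, width):
    """Reshape a flat list of patch values into rows of the given width."""
    return [flat[r:r + width] for r in range(0, len(flat), width)]


# Toy input: a 2 x 2 patch grid (4 patch tokens + 1 class token), 2 layers.
uniform = [0.2] * 5
layer1 = [[0.2, 0.4, 0.2, 0.1, 0.1]] + [uniform] * 4
layer2 = [[0.2, 0.2, 0.4, 0.1, 0.1]] + [uniform] * 4

attention = class_token_attention([layer1, layer2])

# Hypothetical per-class semantic scores, one value per patch token.
semantics = [
    [1.0, 0.0, 1.0, 0.0],  # class 0 responds on the left column
    [0.0, 2.0, 0.0, 2.0],  # class 1 responds on the right column
]

# One semantic-aware activation map per class, on the 2 x 2 patch grid.
maps = [to_grid(m, 2) for m in couple(attention, semantics)]
```

In this toy case the attention alone highlights the top row regardless of class, while the coupled maps keep only the patches where both the attention and the class's semantic score are non-zero.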

    Main Results:

    • TS-CAM significantly outperformed CNN-based methods, with improvements of 11.6% on the ILSVRC dataset and 28.9% on the CUB-200-2011 dataset.
    • Demonstrated state-of-the-art performance in WSOL.
    • Showcased superior performance for multicategory object localization on the Pascal VOC dataset.

    Conclusions:

    • The Vision Transformer, through the TS-CAM method, effectively captures long-range semantic dependencies for robust object localization.
    • TS-CAM overcomes the limitations of CNNs in WSOL, providing accurate localization of full object extents.
    • The proposed approach offers significant advancements in both single- and multicategory weakly supervised object localization.