Fine-Grained Visual Classification via Adaptive Attention Quantization Transformer.

Shishi Qiao, Shixian Li, Haiyong Zheng

    IEEE Transactions on Neural Networks and Learning Systems
    |December 17, 2025
    Summary
    This summary is machine-generated.

    Vision Transformer (ViT) models can be improved for fine-grained visual classification (FGVC) by adaptively selecting discriminative features. Our A2QTrans method enhances attention mechanisms to focus on key image regions, achieving state-of-the-art results.

    Area of Science:

    • Computer Vision
    • Machine Learning
    • Artificial Intelligence

    Background:

    • Vision Transformers (ViT) excel in fine-grained visual classification (FGVC).
    • In existing ViT methods, attention heads often focus on non-discriminative regions, which dilutes the crucial discriminative signals.
    • This necessitates improved attention mechanisms for more effective FGVC.

    Purpose of the Study:

    • To propose a novel Adaptive Attention Quantization Transformer (A2QTrans) for FGVC.
    • To enhance feature selection by analyzing and optimizing attention head behavior.
    • To achieve state-of-the-art performance in fine-grained visual classification tasks.

    Main Methods:

    • Introduced the Adaptive Quantization Selection (AQS) module to dynamically select discriminative features via attention score quantization.

    • Employed a Straight-Through Estimator (STE) for discrete optimization within the AQS module, enabling end-to-end training.
    • Developed a Background Elimination (BE) module to refine attention focus on salient objects and a Dynamic Hybrid Optimization (DHO) module for integrating results.
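
The AQS and STE bullets above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say how many quantization levels are used or how thresholds are chosen, so the binary keep/drop mask, the mean-score threshold, and every function name here are assumptions made for illustration only.

```python
from typing import List, Sequence


def quantize_attention(scores: Sequence[float], threshold: float) -> List[int]:
    """Forward pass: hard-quantize attention scores into a binary mask.

    Assumption: two levels (1 = keep, 0 = drop); the paper's scheme may differ.
    """
    return [1 if s >= threshold else 0 for s in scores]


def ste_backward(upstream_grad: Sequence[float]) -> List[float]:
    """Backward pass under a straight-through estimator.

    The step function in quantize_attention has zero derivative almost
    everywhere, so the estimator treats the quantizer as the identity and
    passes the upstream gradient through unchanged.
    """
    return list(upstream_grad)


def select_tokens(tokens: Sequence[str], mask: Sequence[int]) -> List[str]:
    """Keep only the tokens whose mask entry is 1."""
    return [t for t, m in zip(tokens, mask) if m == 1]


# Hypothetical attention scores for four image patches.
scores = [0.05, 0.40, 0.10, 0.45]
threshold = sum(scores) / len(scores)  # assumed rule: mean score as threshold

mask = quantize_attention(scores, threshold)
kept = select_tokens(["p0", "p1", "p2", "p3"], mask)

# Gradient of some loss with respect to the mask, passed straight through
# to the continuous scores despite the non-differentiable forward step.
grad_scores = ste_backward([0.5, -1.0, 2.0, 0.0])
```

The forward pass makes a discrete selection while the backward pass still delivers a usable gradient to the continuous scores, which is what allows a discrete selection step to sit inside a network trained end to end.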

    Main Results:

    • A2QTrans demonstrated superior performance across four challenging FGVC benchmark datasets.
    • The method achieved state-of-the-art (SOTA) results when tested on three ViT variants.
    • The proposed modules effectively filtered irrelevant information and concentrated attention on discriminative regions.

    Conclusions:

    • A2QTrans advances ViT-based FGVC by quantizing attention scores to select discriminative features and by suppressing background regions.
    • The method's ability to select key discriminative features leads to improved classification accuracy.
    • A2QTrans provides a robust framework for enhancing visual classification tasks.