Improved deep learning image classification algorithm based on Swin Transformer V2.

Jiangshu Wei¹, Jinrong Chen¹, Yuchao Wang²

  • ¹College of Information Engineering, Sichuan Agricultural University, Ya'an, Sichuan, China.

PeerJ Computer Science | December 11, 2023
Summary
This summary is machine-generated.

This study enhances the Swin Transformer V2 model by integrating convolutional neural networks (CNNs) with Transformers. The combined approach improves image classification accuracy and generalization by capturing both local and global features.

Keywords:
Attention mechanism, Convolutional neural networks, Image classification, Transformer

Area of Science:

  • Computer Vision
  • Deep Learning
  • Artificial Intelligence

Background:

  • Convolutional Neural Networks (CNNs) excel at extracting local image features but struggle with global dependencies due to limited receptive fields.
  • Transformers effectively model global dependencies but lack inherent mechanisms for local information exchange within specific regions.
  • Existing models face challenges in simultaneously optimizing local feature extraction and global dependency modeling for image classification.
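The "limited receptive fields" point can be made concrete with the standard receptive-field recurrence for stacked convolutions. The sketch below is illustrative only and is not taken from the paper; the function name `receptive_field` and the layer configurations are assumptions chosen for the example.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked conv layers.

    `layers` is a list of (kernel_size, stride) tuples, first layer first.
    Hypothetical helper for illustration; not part of the paper's code.
    """
    rf = 1    # a single output unit initially "sees" one input pixel
    jump = 1  # distance in input pixels between adjacent units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (k - 1) jumps
        jump *= stride             # striding spreads later units further apart
    return rf


# Three stacked 3x3 convolutions with stride 1 see only a 7-pixel-wide patch.
print(receptive_field([(3, 1)] * 3))
# Adding stride grows the field faster, but it still grows layer by layer.
print(receptive_field([(3, 2), (3, 2), (3, 1)]))
```

With stride 1 the receptive field grows linearly with depth, which is why a purely convolutional model needs many layers before one unit can relate distant image regions, whereas self-attention relates every token pair within its scope in one layer.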

Purpose of the Study:

  • To enhance the Swin Transformer V2 model by combining the complementary strengths of CNNs and Transformers.
  • To improve the model's ability to capture both local image features and long-range dependencies.
  • To boost image classification accuracy and generalization capabilities.

Main Methods:

  • Integration of convolutional operations alongside the self-attention mechanism of the Swin Transformer V2 architecture.
  • Introduction of three components, a Swin Transformer Stem, an inverted residual feed-forward network, and Dual-Branch Downsampling, to strengthen local information extraction.
  • Application of downsampling to the attention mechanism's Query (Q) and Key (K) to reduce computational and memory overhead.
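The saving from downsampling Q and K can be estimated by counting the entries of the attention score matrix. This is a rough sketch under stated assumptions: the summary does not describe the paper's exact downsampling scheme, so the code assumes a spatial stride applied to each axis of an H x W token grid, and the names `token_count` and `score_entries` are hypothetical.

```python
def token_count(height, width, stride=1):
    """Tokens left after downsampling an H x W grid by `stride` per axis.

    Uses ceiling division so partial strides still yield a token.
    Hypothetical helper; the stride-based scheme is an assumption.
    """
    rows = -(-height // stride)
    cols = -(-width // stride)
    return rows * cols


def score_entries(height, width, q_stride=1, k_stride=1):
    """Entries in the Q @ K^T score matrix: (number of queries) x (number of keys)."""
    n_queries = token_count(height, width, q_stride)
    n_keys = token_count(height, width, k_stride)
    return n_queries * n_keys


full = score_entries(16, 16)                            # no downsampling
reduced = score_entries(16, 16, q_stride=2, k_stride=2)  # stride 2 on Q and K
print(full, reduced, full // reduced)
```

Under these assumptions, a stride of 2 on both Q and K cuts each token count by 4 and the score matrix by 16, which is where the reduction in compute and memory comes from. The actual factor in the paper depends on its chosen downsampling ratio and window size.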

Main Results:

  • Classification accuracy improved significantly on multiple image classification datasets under identical training conditions.
  • The new architectural components improved the extraction of local information.
  • The model generalized more robustly than the baseline models.

Conclusions:

  • The proposed hybrid approach effectively leverages the complementary strengths of CNNs and Transformers for superior image classification.
  • The enhanced Swin Transformer V2 model offers a more comprehensive feature representation by integrating local and global modeling.
  • The method provides a promising direction for developing more powerful and efficient deep learning models for computer vision tasks.