
Updated: Feb 28, 2026

Lightweight Multi-Scale Framework for Human Pose and Action Classification.

Alireza Saber1, Mohammad-Mehdi Hosseini2, Amirreza Fateh3

  • 1Faculty of Computer Engineering, Shahrood University of Technology, Shahrood 36199-95161, Iran.

Sensors (Basel, Switzerland) | February 27, 2026
Summary
This summary is machine-generated.

This study introduces a lightweight, attention-based deep learning model for human pose classification. The novel architecture achieves superior accuracy on benchmark datasets while using minimal parameters.

Keywords: classification, human pose, lightweight, multi-scale

Area of Science:

  • Computer Vision and Deep Learning
  • Human-Computer Interaction through human pose classification
  • Artificial Intelligence for activity recognition

Background:

Deep learning architectures have driven significant advances in the automated recognition and monitoring of human activities. Such systems must distinguish subtle movements while coping with high inter-class similarity across datasets containing thousands of samples. Existing frameworks often struggle with dataset noise and with the wide variability of body orientations across demographic groups. Robust multi-scale feature extraction remains difficult when model complexity must be balanced against the strict real-time processing requirements of mobile devices. Traditional convolutional networks frequently fail to capture the long-range dependencies needed to understand complex body mechanics in static images. The lack of interpretability in black-box models further complicates their adoption in sensitive fields such as healthcare and physical therapy. These limitations motivated the development of a more efficient, modular approach to these structural and computational challenges.

Purpose Of The Study:

This research introduces a lightweight, modular, attention-based architecture designed to improve human pose classification accuracy without increasing computational cost. The investigators built the system on a Swin Transformer backbone to obtain robust feature extraction across multiple spatial scales. By integrating specialized attention modules, the framework aims to fuse spatial and channel-wise information more effectively than earlier monolithic designs. The project prioritizes a low total parameter count so that the model can be deployed on resource-constrained hardware such as edge computing nodes. Explainable Artificial Intelligence (XAI) techniques are included to make the resulting classifications more interpretable and reliable for end users. The study addresses high inter-class similarity by refining how the network perceives subtle differences in joint positioning. The researchers set out to validate the design on diverse benchmarks covering both yoga poses and general daily actions.

Main Methods:

The experimental design uses a Swin Transformer backbone to perform multi-scale feature extraction from input images of various physical activities. The researchers integrated a Spatial Attention (SA) module alongside a Context-Aware Channel Attention Module (CACAM) to capture different relationships within the feature maps. A novel Dual Weighted Cross Attention (DWCA) component fuses spatial and channel-wise cues within the hierarchical network structure. The team evaluated this modular design on the Yoga-82 dataset in both 6-class and 20-class configurations, which test two levels of label granularity. Validation also involved the Stanford 40 Actions dataset to assess generalizability across a wide range of human action categories. The methodology applied explainable AI techniques to visualize the decision-making process and identify which body parts influenced the final output. Statistical comparisons against several state-of-the-art baselines measured improvements in precision, recall, and F1-score.
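To make the idea of spatial and channel attention concrete, here is a minimal sketch in plain Python. It is not the paper's SA or CACAM module: the summary gives no internal details, so the parameter-free gates below (global average pooling followed by a sigmoid) and all function names are illustrative assumptions. The sketch only shows the general pattern of computing one weight per channel and one weight per position, then rescaling the feature map.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_gate(fmap):
    """One weight per channel, from global average pooling (toy, no learned weights)."""
    gates = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    return gates

def spatial_gate(fmap):
    """One weight per (row, col) position, from the mean over channels."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [[sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
             for j in range(width)]
            for i in range(height)]

def apply_attention(fmap):
    """Rescale every activation by its channel weight and its position weight."""
    ch = channel_gate(fmap)
    sp = spatial_gate(fmap)
    return [[[fmap[c][i][j] * ch[c] * sp[i][j]
              for j in range(len(fmap[0][0]))]
             for i in range(len(fmap[0]))]
            for c in range(len(fmap))]

# Hypothetical 2-channel, 2x2 feature map (channels x height x width).
fmap = [
    [[2.0, 0.0], [0.0, 2.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
out = apply_attention(fmap)
print(channel_gate(fmap))  # the first channel gets the larger weight
print(out[0])
```

In a trained network these gates would be produced by learned layers rather than by a bare sigmoid of the mean; the toy version keeps the data flow visible.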

Main Results:

The proposed framework outperformed state-of-the-art baselines on precision, recall, F1-score, and mean Average Precision (mAP). It did so with a parameter count of only 0.79 million, making it highly efficient. On the 6-class Yoga-82 configuration, the model reached a classification accuracy of 90.40%, demonstrating reliable identification of broad categories. On the 20-class configuration of the same dataset, accuracy was 87.44% despite the increased label complexity. On the Stanford 40 Actions dataset, the multi-scale system reached a peak accuracy of 94.28% across diverse activity types. Quantitative analysis showed that the Dual Weighted Cross Attention (DWCA) module contributed significantly to the overall gain in predictive power. The Context-Aware Channel Attention Module (CACAM) allowed the system to ignore irrelevant background noise more effectively than standard models.
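For readers less familiar with the reported metrics, the sketch below computes accuracy and macro-averaged precision, recall, and F1-score for a multi-class classifier. The labels and predictions are made-up toy data, not results from the study, and the summary does not say which averaging scheme the authors used, so macro averaging is an assumption here.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treated as one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def macro_scores(y_true, y_pred):
    """Unweighted mean of the per-class scores (macro averaging)."""
    labels = sorted(set(y_true))
    scores = [per_class_scores(y_true, y_pred, lab) for lab in labels]
    return tuple(sum(s[k] for s in scores) / len(labels) for k in range(3))

# Hypothetical ground truth and predictions for three pose classes.
y_true = ["tree", "tree", "plank", "plank", "cobra", "cobra"]
y_pred = ["tree", "plank", "plank", "plank", "cobra", "tree"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f}")
print(f"macro precision={macro_p:.4f} recall={macro_r:.4f} F1={macro_f1:.4f}")
```

Macro averaging weights every class equally, which matters when some poses have far fewer samples than others.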

Conclusions:

These findings suggest that modular attention mechanisms can significantly improve the efficiency and accuracy of human pose classification systems in real-world settings. The reduction in parameter count demonstrates that high accuracy does not necessitate excessive computational overhead in modern deep learning models. Incorporating explainable techniques provides a pathway for more transparent and trustworthy artificial intelligence in practical applications like remote health monitoring. Future efforts may focus on expanding these multi-scale strategies to even more complex action recognition scenarios involving temporal sequences. The study establishes a new benchmark for balancing performance and resource consumption in computer vision tasks related to human movement. This modular design offers a scalable solution for developers looking to implement sophisticated classification tools on low-power consumer electronics. The researchers conclude that the fusion of spatial and channel-wise cues is essential for overcoming the limitations of traditional pose estimation frameworks.

The system utilizes a Dual Weighted Cross Attention (DWCA) module to fuse spatial and channel-wise cues. This allows the Swin Transformer backbone to better distinguish between similar poses by focusing on specific joint relationships and contextual features across multiple scales.
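The general mechanism behind cross-attention fusion can be sketched as follows. This is plain scaled dot-product cross-attention, not the paper's DWCA module, whose dual weighting scheme is not described in this summary. The token values and the names `spatial_tokens` and `channel_tokens` are hypothetical: features from one branch act as queries and pull in information from the other branch.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query returns a similarity-weighted average of the values."""
    dim = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        fused.append([sum(w * v[d] for w, v in zip(weights, values))
                      for d in range(len(values[0]))])
    return fused

# Hypothetical features from two branches, each token has 2 dimensions.
spatial_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
channel_tokens = [[4.0, 0.0], [0.0, 4.0]]

# Spatial tokens query the channel tokens (keys and values).
fused = cross_attention(spatial_tokens, channel_tokens, channel_tokens)
for row in fused:
    print([round(x, 3) for x in row])
```

A query that matches one key more closely draws most of its output from the corresponding value, while a query equally similar to both keys receives their average.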

According to the study's findings, the framework attained a peak accuracy of 94.28% on the Stanford 40 Actions dataset. This was accomplished while maintaining a low parameter count of 0.79 million, outperforming several state-of-the-art baselines in precision and recall.
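To put the 0.79 million parameter figure in perspective, the following back-of-the-envelope sketch estimates the storage needed for the weights alone at a few common numeric precisions. The summary does not state which precision the authors used, so the list of data types is an assumption, and the estimate ignores activations and any runtime overhead.

```python
PARAMS = 0.79e6  # parameter count reported in the summary

# Bytes per stored parameter for some common precisions (assumed, not from the paper).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

def weight_megabytes(n_params, dtype):
    """Size of the weights in megabytes (1 MB = 10**6 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e6

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_megabytes(PARAMS, dtype):.2f} MB")
# float32: 3.16 MB
# float16: 1.58 MB
# int8: 0.79 MB
```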

The researchers selected the Swin Transformer backbone to enable robust multi-scale feature extraction from images. This specific hierarchical design allows the model to capture both local and global dependencies, which is necessary for resolving high inter-class similarity in complex human poses.
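The hierarchical, multi-scale structure can be illustrated with a small calculation. The sketch below uses the defaults of the original Swin Transformer design (4x4 patches, four stages, patch merging that halves each spatial side between stages, 7x7 attention windows). The configuration of the lightweight backbone in this study is not given in the summary, so these values are assumptions for illustration only.

```python
def stage_resolutions(image_size, patch_size=4, num_stages=4):
    """Side length of the token grid at each stage of a Swin-style hierarchy."""
    side = image_size // patch_size
    sizes = []
    for _ in range(num_stages):
        sizes.append(side)
        side //= 2  # patch merging halves each spatial side
    return sizes

def windows_per_stage(resolutions, window_size=7):
    """Number of non-overlapping attention windows at each stage."""
    return [(side // window_size) ** 2 for side in resolutions]

resolutions = stage_resolutions(224)
print(resolutions)                     # [56, 28, 14, 7]
print(windows_per_stage(resolutions))  # [64, 16, 4, 1]
```

Early stages see many small windows over a fine grid (local detail), while the last stage covers the whole grid with a single window (global context), which is the sense in which the backbone captures both local and global dependencies.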

The findings are primarily validated using the Yoga-82 and Stanford 40 Actions datasets, which focus on static pose and action classification. The authors imply that further investigation is required to determine how this lightweight architecture performs in dynamic, temporal-based action recognition scenarios.

The study's authors propose that high-performance human pose classification can be achieved with minimal computational overhead. They state that integrating explainable AI techniques and modular attention will facilitate the deployment of reliable computer vision tools on resource-constrained mobile and edge devices.