A human activity recognition method based on Vision Transformer.

Huiyan Han (1,2,3), Hongwei Zeng (4,5,6), Liqun Kuang (4,5,6)

  • (1) School of Computer Science and Technology, North University of China, Taiyuan, 030051, China. 20050537@nuc.edu.cn

Scientific Reports | July 3, 2024
Summary
This summary is machine-generated.

This study introduces HAR-ViT, a novel method for human activity recognition using Vision Transformer (ViT) and enhanced Graph Convolutional Networks (GCNs). HAR-ViT achieves state-of-the-art performance by effectively processing spatio-temporal skeleton data.

Keywords:
Human activity recognition, Skeleton data, Spatio-temporal, ViT

Area of Science:

  • Computer Vision
  • Artificial Intelligence
  • Machine Learning

Background:

  • Human activity recognition (HAR) is crucial for applications like surveillance and intelligent interaction.
  • Graph Convolutional Networks (GCNs) show promise in HAR but face challenges like over-smoothing and capturing long-range temporal dependencies.
  • Vision Transformer (ViT) has demonstrated strong performance in image-based tasks.
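The over-smoothing problem mentioned above can be illustrated with a toy example. The sketch below is not the paper's model: it assumes a hypothetical 5-joint chain skeleton and a parameter-free propagation step (multiplication by a row-normalized adjacency matrix with self-loops) in place of a trained graph convolution. Stacking many such steps pulls all joint features toward the same value, so the joints become hard to tell apart.

```python
def normalized_adjacency(num_joints, bones):
    """Row-normalized adjacency with self-loops: D^-1 (A + I)."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in bones:
        a[i][j] = a[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def propagate(a_hat, x):
    """One parameter-free graph propagation step: x <- A_hat x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in a_hat]


def spread(x):
    """Gap between the largest and smallest joint feature."""
    return max(x) - min(x)


# Hypothetical skeleton: five joints connected in a chain (assumption).
bones = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(5, bones)

# One scalar feature per joint, initially well separated.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
spreads = [spread(x)]
for _ in range(50):  # 50 stacked propagation steps
    x = propagate(a_hat, x)
    spreads.append(spread(x))

print(f"spread before: {spreads[0]:.3f}, after 50 steps: {spreads[-1]:.6f}")
```

The shrinking spread is the toy analogue of over-smoothing: after enough layers every joint carries nearly the same feature. Real GCN layers add learned weights and nonlinearities, so the effect in practice is less clean than in this sketch.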

Purpose of the Study:

  • To propose a novel human activity recognition method, HAR-ViT, leveraging Vision Transformer (ViT) architecture.
  • To address limitations of existing GCN-based methods in capturing spatio-temporal dynamics and long-range movements.
  • To enhance the processing of 3D skeleton data for improved HAR accuracy.

Main Methods:

  • Integration of an enhanced adaptive graph convolutional layer (eAGCL) within a ViT framework to process spatio-temporal skeleton data.
  • Utilizing a position encoder to manage non-sequenced information and a transformer encoder for efficient sequence data compression.
  • Employing a multi-layer perceptron (MLP) classifier for final human activity recognition.
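The stages listed above (graph layer, position encoder, transformer encoder, MLP classifier) can be sketched as a minimal forward pass. This is an illustration under stated assumptions, not the authors' implementation: the neighbour-averaging `spatial_mix` is only a stand-in for eAGCL, the sinusoidal position encoding, identity attention projections, and mean pooling are assumed choices, and all weights, shapes, and names are hypothetical and randomly initialized.

```python
import math
import random


def spatial_mix(frame, a_hat):
    """Stand-in for the graph layer: mix each joint with its neighbours."""
    dims = len(frame[0])
    return [[sum(a_hat[i][j] * frame[j][d] for j in range(len(frame)))
             for d in range(dims)] for i in range(len(frame))]


def positional_encoding(t, dim):
    """Sinusoidal encoding of frame index t (an assumed choice)."""
    return [math.sin(t / 10000 ** (i / dim)) if i % 2 == 0
            else math.cos(t / 10000 ** ((i - 1) / dim)) for i in range(dim)]


def self_attention(tokens):
    """Single-head scaled dot-product attention, identity projections."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out


def mlp_classify(x, w1, w2):
    """Two-layer perceptron with ReLU, then softmax over classes."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    return [e / sum(exps) for e in exps]


def classify_clip(clip, a_hat, w1, w2):
    """Skeleton clip (frames x joints x coords) -> class probabilities."""
    tokens = []
    for t, frame in enumerate(clip):
        # One token per frame: mixed joints flattened into a vector.
        tok = [v for joint in spatial_mix(frame, a_hat) for v in joint]
        pe = positional_encoding(t, len(tok))
        tokens.append([a + b for a, b in zip(tok, pe)])
    encoded = self_attention(tokens)
    # Mean pooling over frames (assumption; a class token is another option).
    pooled = [sum(col) / len(encoded) for col in zip(*encoded)]
    return mlp_classify(pooled, w1, w2)


# Hypothetical toy sizes and random, untrained weights.
rng = random.Random(0)
JOINTS, COORDS, FRAMES, HIDDEN, CLASSES = 3, 3, 4, 8, 5
a_hat = [[0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 0.5, 0.5]]
clip = [[[rng.uniform(-1, 1) for _ in range(COORDS)]
         for _ in range(JOINTS)] for _ in range(FRAMES)]
w1 = [[rng.gauss(0, 0.5) for _ in range(JOINTS * COORDS)]
      for _ in range(HIDDEN)]
w2 = [[rng.gauss(0, 0.5) for _ in range(HIDDEN)] for _ in range(CLASSES)]

probs = classify_clip(clip, a_hat, w1, w2)
print([round(p, 3) for p in probs])
```

Because the weights are random, the printed probabilities carry no meaning; the point is only the order of operations. Without the position encoding, the attention and pooling steps would treat the frames as an unordered set, which is why a position encoder is needed before the transformer encoder.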

Main Results:

  • The proposed HAR-ViT method achieves state-of-the-art (SOTA) performance.
  • Demonstrated effectiveness on three standard datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics-Skeleton 400.
  • Processes 3D skeleton data, capturing spatial features while improving computation speed.

Conclusions:

  • HAR-ViT offers a powerful new approach for human activity recognition.
  • The integration of ViT with enhanced GCN components effectively addresses existing challenges in spatio-temporal data analysis.
  • The method shows significant potential for real-world applications requiring accurate and efficient activity recognition.