A human activity recognition method based on Vision Transformer.

Huiyan Han^1,2,3, Hongwei Zeng^4,5,6, Liqun Kuang^4,5,6

¹School of Computer Science and Technology, North University of China, Taiyuan, 030051, China. 20050537@nuc.edu.cn.

Scientific Reports | July 3, 2024

Summary

This summary is machine-generated.

This study introduces HAR-ViT, a novel method for human activity recognition using Vision Transformer (ViT) and enhanced Graph Convolutional Networks (GCNs). HAR-ViT achieves state-of-the-art performance by effectively processing spatio-temporal skeleton data.

Keywords:

Human activity recognition; Skeleton data; Spatio-temporal; ViT

Area of Science:

Computer Vision
Artificial Intelligence
Machine Learning

Background:

Human activity recognition (HAR) is crucial for applications like surveillance and intelligent interaction.
Graph Convolutional Networks (GCNs) show promise in HAR but suffer from over-smoothing and struggle to capture long-range temporal dependencies.
Vision Transformer (ViT) has demonstrated strong performance in image-based tasks.
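The over-smoothing problem mentioned above can be illustrated with a small sketch. This is a toy example, not the paper's model: it assumes a made-up 5-joint chain skeleton, a single scalar feature per joint, and plain mean aggregation with no learned weights. Repeated neighbour averaging drives all joint features toward the same value, so the information that distinguishes joints fades.

```python
# Toy illustration of over-smoothing, not the paper's model.
# The 5-joint chain and the initial feature values are made up for this example.

def mean_aggregation_matrix(edges, num_nodes):
    """Row-normalised adjacency with self-loops: D^-1 (A + I)."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each joint keeps part of its own feature
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    return [[x / sum(row) for x in row] for row in a]

def propagate(p, feats):
    """One weight-free graph-convolution step: each joint averages its neighbourhood."""
    n = len(feats)
    return [sum(p[i][j] * feats[j] for j in range(n)) for i in range(n)]

def spread(feats):
    """Gap between the largest and smallest joint feature."""
    return max(feats) - min(feats)

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]   # hypothetical chain of joints
p = mean_aggregation_matrix(edges, 5)
feats = [1.0, 0.0, 0.0, 0.0, 0.0]          # only joint 0 carries signal
history = [spread(feats)]
for _ in range(50):                        # 50 stacked propagation steps
    feats = propagate(p, feats)
    history.append(spread(feats))

# The spread shrinks with depth: joints become nearly indistinguishable.
print(history[0], history[1], history[-1])
```

Real GCN layers interleave learned weights and nonlinearities, so the effect is less extreme than in this sketch, but the same averaging tendency is what limits how deep such networks can usefully go.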

Purpose of the Study:

To propose a novel human activity recognition method, HAR-ViT, leveraging Vision Transformer (ViT) architecture.
To address limitations of existing GCN-based methods in capturing spatio-temporal dynamics and long-range movements.
To enhance the processing of 3D skeleton data for improved HAR accuracy.

Main Methods:

Integrating an enhanced adaptive graph convolutional layer (eAGCL) into a ViT framework to process spatio-temporal skeleton data.
Using a position encoder to order non-sequenced information and a transformer encoder to compress sequence features efficiently.
Using a multi-layer perceptron (MLP) classifier for the final activity prediction.
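The flow of the stages listed above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: neighbour averaging stands in for the eAGCL step, the position encoder is assumed to be sinusoidal, attention uses a single head with identity projections, frames are mean-pooled before the classifier, and all sizes, edges, and weights (`T`, `V`, `EDGES`, `w1`, `w2`) are hypothetical.

```python
import math
import random

# Hypothetical toy sizes; the paper's actual dimensions are not given in this summary.
T, V, C, NUM_CLASSES = 4, 3, 3, 2      # frames, joints, coordinates, activity classes
EDGES = [(0, 1), (1, 2)]               # made-up 3-joint chain

def spatial_mix(frame):
    """Stand-in for the eAGCL step: average each joint with its graph neighbours."""
    nbrs = {v: [v] for v in range(V)}
    for i, j in EDGES:
        nbrs[i].append(j)
        nbrs[j].append(i)
    return [[sum(frame[n][c] for n in nbrs[v]) / len(nbrs[v]) for c in range(C)]
            for v in range(V)]

def positional_encoding(pos, dim):
    """Sinusoidal encoding (an assumption; the paper's position encoder may differ)."""
    return [math.sin(pos / 10000 ** (k / dim)) if k % 2 == 0
            else math.cos(pos / 10000 ** ((k - 1) / dim)) for k in range(dim)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head attention with identity Q/K/V projections, for brevity."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)  # every frame attends to every other frame
        out.append([sum(w[t] * tokens[t][k] for t in range(len(tokens)))
                    for k in range(d)])
    return out

def mlp_classify(x, w1, w2):
    """One hidden ReLU layer followed by a softmax over classes."""
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w1]
    logits = [sum(wi * hi for wi, hi in zip(row, hidden)) for row in w2]
    return softmax(logits)

def classify_sequence(seq, w1, w2):
    tokens = []
    for t, frame in enumerate(seq):
        flat = [x for joint in spatial_mix(frame) for x in joint]  # one token per frame
        pe = positional_encoding(t, len(flat))
        tokens.append([a + b for a, b in zip(flat, pe)])
    encoded = self_attention(tokens)
    # Mean pooling over frames is an assumption made to keep the sketch short.
    pooled = [sum(tok[k] for tok in encoded) / len(encoded)
              for k in range(len(encoded[0]))]
    return mlp_classify(pooled, w1, w2)

rng = random.Random(0)
D, HIDDEN = V * C, 8
seq = [[[rng.uniform(-1, 1) for _ in range(C)] for _ in range(V)] for _ in range(T)]
w1 = [[rng.gauss(0, 0.5) for _ in range(D)] for _ in range(HIDDEN)]       # untrained
w2 = [[rng.gauss(0, 0.5) for _ in range(HIDDEN)] for _ in range(NUM_CLASSES)]
probs = classify_sequence(seq, w1, w2)
print(probs)  # class probabilities from random, untrained weights
```

Because attention scores are computed between every pair of frames, a movement early in the sequence can influence the representation of a much later frame in a single step, which is the property that motivates using a transformer encoder for long-range temporal dependencies.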

Main Results:

The proposed HAR-ViT method achieves state-of-the-art (SOTA) performance.
Its effectiveness is demonstrated on three standard datasets: NTU RGB+D 60, NTU RGB+D 120, and Kinetics-Skeleton 400.
The method processes 3D skeleton data directly, capturing spatial features while improving computation speed.

Conclusions:

HAR-ViT offers a powerful new approach for human activity recognition.
The integration of ViT with enhanced GCN components effectively addresses existing challenges in spatio-temporal data analysis.
The method shows significant potential for real-world applications requiring accurate and efficient activity recognition.