Human Interaction Recognition Based on Whole-Individual Detection.

Qing Ye1, Haoxin Zhong1, Chang Qu1

  • 1School of Information Science and Technology, North China University of Technology, Beijing 100144, China.

Sensors (Basel, Switzerland) | April 25, 2020
Summary
This summary is machine-generated.

This study introduces a method to improve how computers recognize human interactions in video. By combining global scene information from the whole video with detail features of each individual, the researchers achieved 91.7% accuracy on the UT-interaction dataset. The approach targets two common problems: the spatial complexity of multi-person actions and the large amount of redundant data in video.

Keywords:
Gaussian model downsampling; human interaction recognition; parallel multi-feature fusion network; whole-individual detection; deep learning; video analysis; feature fusion; action classification

Area of Science:

  • Computer vision and human interaction recognition research
  • Artificial intelligence and machine learning applications

Background:

Recognizing interactions between people in video remains difficult. Current systems struggle with the spatial complexity of multi-person scenes and with action characteristics that differ across the time periods of an interaction. Deep networks can also degrade in performance as their depth increases during training, and the large amount of redundant data in video obscures the information needed for precise classification. Standard approaches often fail to capture individual and collective movement patterns at the same time, which motivates methods that isolate and combine the relevant behavioral cues.

Purpose Of The Study:

This study aims to improve the accuracy of human interaction recognition in video by addressing both temporal and spatial complexity. To handle the differences in action characteristics across time periods, the authors propose an improved fusion time-phase feature of the Gaussian model, which selects video keyframes and discards redundant data. To handle complex action features and the network degradation caused by increasing depth, they propose a multi-feature fusion network with parallel Inception and ResNet branches, which also reduces the number of network parameters. To handle spatial complexity, they combine global features from the whole video with detail features from each individual, so that the available feature information is fully used.

Main Methods:

The research team employed a multi-feature fusion network algorithm to process complex interactive action sequences. They utilized a parallel architecture combining Inception and ResNet modules to optimize performance and reduce parameter counts. To handle temporal variations, the investigators implemented an improved fusion time-phase feature of the Gaussian model. This approach facilitated the extraction of video keyframes while simultaneously discarding large amounts of extraneous data. The study design focused on integrating global scene features with specific individual detail features throughout the analysis. Researchers performed evaluations using the UT-interaction dataset to test the robustness of their proposed classification framework. This methodology prioritized the synthesis of distinct feature streams to address spatial complexity in multi-person scenarios. The experimental approach ensured that both collective and personal behavioral information contributed to the final recognition output.
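The parallel Inception and ResNet idea described above can be illustrated with a toy sketch. This is not the authors' network: it assumes 1D signals, fixed hand-written kernels instead of learned 2D convolutions, and hypothetical function names (`conv1d_same`, `inception_branch`, `residual_branch`, `parallel_block`). It only shows the two structural ideas: an Inception-style branch applies several kernel sizes side by side, and a ResNet-style branch adds its input back to its output.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D correlation that keeps the input length (odd kernel sizes)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += weight * signal[j]
        out.append(acc)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def inception_branch(signal, kernels):
    """Inception-style: one output channel per kernel size, computed in parallel."""
    return [relu(conv1d_same(signal, kernel)) for kernel in kernels]


def residual_branch(signal, kernel):
    """ResNet-style: output = input + F(input), where F is conv followed by ReLU."""
    transformed = relu(conv1d_same(signal, kernel))
    return [x + f for x, f in zip(signal, transformed)]


def parallel_block(signal, inception_kernels, residual_kernel):
    """Run both branches on the same input and stack their channels (assumed fusion)."""
    channels = inception_branch(signal, inception_kernels)
    channels.append(residual_branch(signal, residual_kernel))
    return channels


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    block = parallel_block(x, [[1.0], [0.25, 0.5, 0.25]], [0.0, 0.0, 0.0])
    for channel in block:
        print(channel)
```

With an all-zero residual kernel, the residual branch returns its input unchanged. That identity path is the usual explanation for why residual connections ease the degradation seen in very deep networks.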

Main Results:

The proposed algorithm achieved a classification accuracy of 91.7% on the UT-interaction dataset. The authors attribute this result to the integration of global scene features with individual detail features. The parallel Inception and ResNet architecture reduced the number of network parameters and alleviated the degradation that typically accompanies increasing network depth. The Gaussian-based temporal model reduced the influence of redundant information in the video. According to the authors, the combined model captured complex interactive action features more reliably than single-stream methods, and it handled spatial complexity by drawing on information from both participants in an interaction.

Conclusions:

The authors propose that integrating global and individual video streams enhances the precision of behavioral classification tasks. This synthesis suggests that capturing both scene-wide context and personal detail optimizes the extraction of relevant information. The researchers claim their parallel network architecture successfully mitigates issues related to parameter inflation and model degradation. Their findings indicate that utilizing Gaussian-based temporal modeling effectively filters out unnecessary noise from video sequences. The study concludes that this dual-feature strategy provides a viable path for overcoming spatial complexity in automated recognition. These results imply that focusing on whole-individual detection improves the reliability of systems analyzing multi-person dynamics. The team asserts that their approach achieves superior classification outcomes compared to traditional methods lacking this integrated perspective. Ultimately, the evidence supports the utility of combining distinct feature sets to advance the state of the art in this domain.

The researchers propose a multi-feature fusion network that integrates global scene data with individual detail features. This approach utilizes a parallel Inception and ResNet architecture to process video inputs, achieving a 91.7% accuracy rate on the UT-interaction dataset.

The authors employ an improved fusion time-phase feature of the Gaussian model to identify keyframes. This specific tool allows the system to discard redundant information, which helps the algorithm focus on the most relevant temporal segments of the video.
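As a rough illustration of Gaussian-based keyframe selection, the sketch below picks frame indices more densely around one part of the clip. The details are assumptions, not the authors' formulation: the function name `gaussian_keyframes`, the default center at the middle of the clip (`center_frac=0.5`), and the spread `sigma_frac=0.2` are all hypothetical. Frames are chosen at evenly spaced quantiles of a Gaussian weight curve over the frame index.

```python
import math
from bisect import bisect_left


def gaussian_keyframes(num_frames, num_keyframes, center_frac=0.5, sigma_frac=0.2):
    """Return sorted, unique frame indices sampled more densely near the center.

    center_frac and sigma_frac are illustrative assumptions, expressed as
    fractions of the clip length.
    """
    if num_frames <= 0 or num_keyframes <= 0:
        raise ValueError("num_frames and num_keyframes must be positive")

    mu = center_frac * (num_frames - 1)
    sigma = max(sigma_frac * num_frames, 1e-9)

    # Gaussian weight for every frame index.
    weights = [math.exp(-0.5 * ((i - mu) / sigma) ** 2) for i in range(num_frames)]
    total = sum(weights)

    # Cumulative distribution over frame indices.
    cdf = []
    running = 0.0
    for w in weights:
        running += w / total
        cdf.append(running)

    # Take frames at evenly spaced quantiles; duplicates are dropped.
    picked = []
    for j in range(num_keyframes):
        quantile = (j + 0.5) / num_keyframes
        idx = min(bisect_left(cdf, quantile), num_frames - 1)
        if not picked or picked[-1] != idx:
            picked.append(idx)
    return picked


if __name__ == "__main__":
    print(gaussian_keyframes(num_frames=100, num_keyframes=10))
```

Because duplicates are dropped, short clips can yield fewer indices than requested. Quantile sampling is deterministic, which makes the selection reproducible across runs.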

The researchers state that the whole video provides global features of both participants, while individual videos capture specific detail features of a single person. This spatial distinction is necessary to address the complexity of multi-person interactions.

The whole video acts as a global context provider, while individual videos supply granular behavioral data. By combining these two data types, the model makes full use of the available information to improve classification performance.
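One simple way to combine a whole-video feature vector with per-person feature vectors is concatenation followed by a linear classifier. The sketch below assumes that design; the summary does not say how the authors fuse the streams, and the names `fuse_features` and `classify`, the feature sizes, and the weights are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(global_feat, individual_feats):
    """Concatenate the whole-video features with each individual's features."""
    fused = list(global_feat)
    for feat in individual_feats:
        fused.extend(feat)
    return fused


def classify(fused, weights, biases):
    """Linear layer plus softmax; returns (predicted class index, probabilities)."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


if __name__ == "__main__":
    # Hypothetical features: 2 global values and 1 value per person.
    fused = fuse_features([1.0, 0.0], [[0.5], [0.25]])
    # Hypothetical weights for a 2-class problem.
    label, probs = classify(fused, [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]], [0.0, 0.0])
    print(label, probs)
```

Concatenation keeps both kinds of information available to the classifier, so it can weigh scene context against individual detail.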

The researchers measured the performance of their algorithm using the UT-interaction dataset. They reported that their proposed method achieved a classification accuracy of 91.7% by effectively fusing these distinct feature sets.

The authors suggest that their contribution to the field lies in the full utilization of feature information from both whole and individual perspectives. They propose that this strategy effectively alleviates network degradation while enhancing overall classification precision.