TSLNet: a hierarchical multi-head attention-enabled two-stream LSTM network for accurate pedestrian tracking and behavior recognition

  • Xiangjiaba Hydropower Plant, Yibin, China.

Summary

This summary is machine-generated.

TSLNet is a Hierarchical Multi-Head Attention-Enabled Two-Stream LSTM Network for pedestrian tracking and behavior recognition. The model is designed for complex environments and is aimed at surveillance and smart transportation systems.

Area Of Science

  • Computer Vision
  • Artificial Intelligence
  • Machine Learning

Background

  • Accurate pedestrian tracking and behavior recognition are crucial for intelligent surveillance, smart transportation, and human-computer interaction.
  • Real-world video data presents challenges like environmental variability, high-density crowds, and diverse pedestrian movements.

Purpose Of The Study

  • To introduce TSLNet, a Hierarchical Multi-Head Attention-Enabled Two-Stream LSTM Network.
  • To improve pedestrian tracking and behavior recognition in complex and dynamic environments.

Main Methods

  • TSLNet integrates a Two-Stream Convolutional Neural Network (Two-Stream CNN) with Long Short-Term Memory (LSTM) networks for spatial-temporal feature extraction.
  • A Multi-Head Attention mechanism focuses on relevant features, while Hierarchical Classifiers within a Multi-Task Learning framework enable simultaneous basic and complex behavior recognition.
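The attention step described above can be illustrated with a minimal sketch of generic multi-head scaled dot-product self-attention applied to a sequence of per-time-step feature vectors, such as the outputs of an LSTM. This summary does not give TSLNet's actual layer sizes, projection weights, or head count, so every name and dimension below (`multi_head_self_attention`, `d_model = 8`, `num_heads = 2`, the random projection matrices) is a hypothetical stand-in, and the code uses only the Python standard library rather than the authors' implementation.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(seq, w_q, w_k, w_v, w_o, num_heads):
    """seq is a T x D list of feature vectors (assumed here to be LSTM outputs).

    Returns the attended T x D sequence and the per-head T x T attention weights.
    """
    d_model = len(seq[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    q, k, v = matmul(seq, w_q), matmul(seq, w_k), matmul(seq, w_v)

    head_outputs, head_weights = [], []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        kh_t = [list(col) for col in zip(*kh)]  # transpose to d_head x T
        scores = matmul(qh, kh_t)               # T x T dot products
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, vh))
        head_weights.append(weights)

    # Concatenate the heads along the feature axis, then apply the output projection.
    concat = [sum((head[t] for head in head_outputs), []) for t in range(len(seq))]
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
T, d_model, num_heads = 5, 8, 2  # illustrative sizes only
features = random_matrix(T, d_model, rng)
w_q, w_k, w_v, w_o = (random_matrix(d_model, d_model, rng) for _ in range(4))
attended, attn = multi_head_self_attention(features, w_q, w_k, w_v, w_o, num_heads)
```

Each row of `attn[h]` is a probability distribution over time steps, which is what lets a head concentrate on the frames most relevant to a behavior. In a trained model the projection matrices would be learned jointly with the rest of the network.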

Main Results

  • TSLNet significantly outperforms existing baseline models on multiple datasets.
  • It achieves higher Accuracy, Precision, Recall, F1-Score, and mAP than the baselines for behavior recognition.
  • It also attains higher MOTA and IDF1 than the baselines for pedestrian tracking.
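For readers unfamiliar with the metrics named above, the following sketch computes Precision, Recall, F1-Score, MOTA, and IDF1 from raw counts using their commonly cited definitions. The summary does not say which evaluation toolkit the authors used, so the function names are hypothetical and the counts in the usage lines are made-up illustrative values, not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Classification metrics from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT.

    num_gt is the total number of ground-truth objects summed over all frames.
    The value can be negative when errors outnumber ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1: 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Illustrative counts only (not taken from the paper).
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
tracking_accuracy = mota(false_negatives=10, false_positives=5, id_switches=2, num_gt=100)
identity_f1 = idf1(idtp=80, idfp=10, idfn=20)
```

MOTA penalizes missed detections, false alarms, and identity switches together, while IDF1 measures how consistently each pedestrian keeps the same identity across the sequence, so the two are usually reported side by side.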

Conclusions

  • TSLNet effectively enhances both pedestrian tracking and behavior recognition performance.
  • The proposed network is highly effective in handling complex real-world video data.
  • TSLNet shows significant potential for applications in intelligent surveillance and smart transportation.