Target Adaptive Tracking Based on GOTURN Algorithm with Convolutional Neural Network and Data Fusion.

Zhengze Li1,2, Jiancheng Xu1

  • 1School of Electronics and Information, Northwestern Polytechnic University, Xi'an, Shaanxi 710016, China.

Computational Intelligence and Neuroscience | August 16, 2021
PubMed
Summary
This summary is machine-generated.


This paper introduces an improved method for tracking objects in video. By extending the existing GOTURN algorithm with a residual attention mechanism and spatiotemporal context information, the researchers report higher accuracy and stability when tracking moving targets across frames.

Area of Science:

  • Computer vision and machine learning within artificial intelligence
  • Generic Object Tracking Using Regression Networks (GOTURN) optimization research

Background:

Visual tracking systems often lose precision when objects move through complex environments. Performance tends to drop sharply when lighting conditions change or the target becomes partially occluded, which has driven the search for more robust tracking architectures. Prior research has shown that standard regression networks often fail to capture fine-grained spatial details. The limitations of basic feature extraction in real-time object localization had not been resolved, and this gap motivated more sophisticated neural network configurations for tracking. A long-standing goal is to combine high-speed processing with reliable target identification, and distinguishing targets from cluttered backgrounds during rapid motion remains difficult.

Purpose Of The Study:

The primary aim of this study is to improve the tracking accuracy and robustness of the Generic Object Tracking Using Regression Networks (GOTURN) algorithm. The researchers seek to address the performance limitations of the original model on complex visual tasks, in particular its low precision in dynamic environments. They propose integrating a residual attention mechanism to improve how the network processes target features, and they use spatiotemporal context information to refine the data fusion step within the tracking pipeline. The work is motivated by the increasing demand for reliable tracking in autonomous driving and intelligent monitoring systems. The team examines whether these architectural modifications outperform existing methods, with the goal of providing a more effective solution for real-time target localization.

Keywords:
computer vision, neural networks, autonomous driving, feature extraction, data fusion

Frequently Asked Questions

The researchers propose a dual-improvement strategy: integrating a residual attention mechanism to refine feature expression and using spatiotemporal context fusion to enhance localization. This combination allows the network to distinguish targets from backgrounds better than the original, less robust regression-based architecture.

The authors use a convolutional neural network as the foundational structure. They incorporate a residual attention mechanism within the target template network to improve the system's ability to identify and represent relevant visual information during tracking.

The researchers indicate that feeding the target template, prediction area, and search area into the network simultaneously is necessary to extract a comprehensive general feature map. This parallel input strategy enables the fully connected layer to predict the target location in the current frame.

Main Methods:

The research team modified the existing regression network architecture, using a convolutional neural network as the primary framework for processing visual input sequences. Their approach integrates a residual attention mechanism directly into the target template branch of the network. To support data fusion, the investigators implemented a module that combines spatiotemporal context information from consecutive video frames. The target template, prediction region, and search area are fed into the network simultaneously, and a fully connected layer outputs the final coordinates of the tracked object. Experimental validation used standard, mainstream data sets from the computer vision community, which allowed a direct comparison between the original algorithm and the enhanced version.
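The pipeline described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the authors' network: features are short lists rather than convolutional feature maps, the residual attention rule `(1 + mask) * feature` is the commonly used residual-attention formulation and is an assumption here because the summary does not state the exact form, and every function name and weight below is hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def residual_attention(features, mask_logits):
    """Reweight features as (1 + mask) * feature (assumed formulation).

    The soft mask lies in (0, 1), so each output has between one and two
    times the magnitude of its input: salient features are amplified and
    no feature is suppressed below its original value.
    """
    return [(1.0 + sigmoid(m)) * f for f, m in zip(features, mask_logits)]


def fuse_branches(template, prediction, search):
    """Concatenate the three branch features into one general feature vector."""
    return template + prediction + search


def fully_connected(features, weights, bias):
    """One linear layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def predict_box(template, prediction, search, mask_logits, weights, bias):
    """Attention on the template branch, then fusion, then box regression."""
    attended = residual_attention(template, mask_logits)
    fused = fuse_branches(attended, prediction, search)
    return fully_connected(fused, weights, bias)


# Hypothetical toy inputs: two template features, one feature each for the
# prediction and search branches, and identity weights for readability.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
box = predict_box([1.0, 2.0], [0.5], [0.25], [0.0, 0.0], identity, [0.0] * 4)
```

In a real tracker the mask logits and the fully connected weights would be learned, and the four outputs would be interpreted as bounding-box coordinates; the identity weights here only make the data flow easy to follow.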

Main Results:

The proposed algorithm exhibits a significant improvement in overall tracking performance compared to the baseline regression network. Quantitative analysis across mainstream test data sets confirms that the integration of attention mechanisms enhances feature expression. The system successfully predicts target locations with greater precision by leveraging spatiotemporal context information during the fusion process. The experimental data indicate that the modified network structure effectively mitigates the robustness issues found in the original model. By refining the feature map extraction, the algorithm maintains higher accuracy during complex tracking scenarios. The findings demonstrate that the combination of residual attention and context fusion yields superior results over standard methods. The researchers report that these architectural adjustments lead to more stable tracking outcomes in diverse environmental conditions. The comparative metrics highlight a clear performance gain achieved through the proposed algorithmic enhancements.

Conclusions:

The authors suggest that their modified architecture addresses the precision deficits observed in standard regression-based trackers. The results indicate that incorporating a residual attention mechanism strengthens the network's ability to represent target features, and that integrating spatiotemporal context information provides a more stable foundation for continuous object localization. Together, the proposed enhancements yield a measurable increase in overall tracking performance over the baseline model. The researchers conclude that their approach offers a viable path for improving robustness in challenging visual scenarios. The work highlights the potential of combining attention-based feature refinement with multi-source data fusion, and it points toward more reliable outcomes in automated monitoring and navigation systems.

The authors employ spatiotemporal context information to facilitate data fusion. This approach allows the system to leverage both the appearance of the target and its movement patterns across frames, resulting in more reliable tracking than relying solely on static template matching.
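One simple way to turn movement across frames into a prediction area is constant-velocity extrapolation from the two most recent boxes. The sketch below is only an illustration of that idea: the summary does not say how the authors derive the prediction area, so the constant-velocity assumption, the `scale` parameter, and the function names are all hypothetical.

```python
def center(box):
    """Return the (cx, cy) center of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def predict_area(prev_box, curr_box, scale=2.0):
    """Extrapolate the next search region from two consecutive boxes.

    Assumption: the target keeps the same velocity between frames. The
    region keeps the current box size, enlarged by `scale`, and is
    centered on the extrapolated position.
    """
    (px, py), (cx, cy) = center(prev_box), center(curr_box)
    # Next center = current center + displacement observed in the last step.
    nx, ny = cx + (cx - px), cy + (cy - py)
    half_w = (curr_box[2] - curr_box[0]) * scale / 2.0
    half_h = (curr_box[3] - curr_box[1]) * scale / 2.0
    return (nx - half_w, ny - half_h, nx + half_w, ny + half_h)


# The target moved by (+4, +2) between frames, so the region is centered
# one further step along that direction.
region = predict_area((0, 0, 10, 10), (4, 2, 14, 12))
```

A crop taken from this region could then be supplied alongside the static template, so the network sees both what the target looks like and where its recent motion suggests it should be.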

The researchers measure performance by evaluating the algorithm against current mainstream target-tracking test data sets. They report that their proposed method demonstrates a significant improvement in overall tracking performance when compared to the original, unmodified version of the regression network.
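The summary does not name the metrics used, but tracking benchmarks commonly score a tracker by the overlap between predicted and ground-truth boxes. The following sketch, with hypothetical function names and an assumed threshold of 0.5, shows how such an overlap-based success rate could be computed for comparing a baseline with a modified tracker.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def success_rate(predicted, ground_truth, threshold=0.5):
    """Fraction of frames whose predicted box overlaps the truth by >= threshold."""
    if not predicted:
        return 0.0
    hits = sum(
        1 for p, g in zip(predicted, ground_truth) if iou(p, g) >= threshold
    )
    return hits / len(predicted)


# Two frames: a perfect prediction and one that overlaps by only one third.
truth = [(0, 0, 2, 2), (1, 0, 3, 2)]
preds = [(0, 0, 2, 2), (0, 0, 2, 2)]
rate = success_rate(preds, truth)
```

Running the same function over the outputs of two trackers on the same sequences gives a like-for-like comparison, which is the kind of baseline-versus-modified evaluation the answer above describes.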

The authors propose that their enhanced tracking method has practical utility in fields such as human-computer interaction, intelligent monitoring, and autonomous driving. They suggest that these improvements are vital for the continued development of reliable tracking technologies in the era of artificial intelligence.