Updated: Sep 8, 2025

Structured Attention Composition for Temporal Action Localization.

Le Yang, Junwei Han, Tao Zhao

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

June 13, 2022

Summary

This summary is machine-generated.

This study introduces a structured attention composition module for temporal action localization. The module adaptively weights appearance and motion features, which improves localization accuracy in untrimmed videos.

Area of Science:

Computer Vision
Artificial Intelligence
Machine Learning

Background:

Temporal action localization in untrimmed videos is crucial for understanding video content.
Existing methods often treat appearance and motion features equally, leading to suboptimal performance.
Different actions have varying dependencies on appearance versus motion cues.

Purpose of the Study:

To propose a novel multi-modality feature learning approach for temporal action localization.
To develop a structured attention composition module that adaptively weights appearance and motion features.
To improve the precision of localizing action instances in untrimmed videos.
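
The idea of adaptive weighting can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-element score vector, and the softmax normalization are all assumptions chosen for illustration. It only shows how one frame's appearance and motion features might be blended with learned weights instead of equal ones.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a short list of raw scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_frame(appearance, motion, modality_scores):
    # Blend one frame's appearance and motion feature vectors.
    # modality_scores is a hypothetical pair of raw scores
    # (appearance, motion); in a real model it would be predicted.
    w_app, w_mot = softmax(modality_scores)
    return [w_app * a + w_mot * m for a, m in zip(appearance, motion)]

appearance_feat = [1.0, 0.0]
motion_feat = [0.0, 1.0]

# Equal scores reproduce the "treat both modalities equally" baseline.
equal_blend = fuse_frame(appearance_feat, motion_feat, [0.0, 0.0])

# A higher motion score shifts the fused feature toward motion.
motion_heavy = fuse_frame(appearance_feat, motion_feat, [0.0, 2.0])
```

With equal scores the fused feature is a plain average; raising the motion score moves the result toward the motion feature, which is the behavior an action that depends mostly on motion cues would call for.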

Main Methods:

A novel structured attention composition module is introduced, integrating frame and modality attention.
This module learns a frame-modality structure using optimal transport theory.
It regularizes individual frame and modality attention inferences for better feature representation.
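
One way to picture a frame-modality structure derived with optimal transport is the sketch below. It uses entropy-regularized optimal transport solved by Sinkhorn iterations, which is a standard technique but not necessarily the formulation used in the paper. The cost matrix, the use of frame attention and modality attention as the two marginals, and every name in the code are assumptions made for illustration.

```python
import math

def sinkhorn(cost, row_marginal, col_marginal, epsilon=0.5, iterations=200):
    # Entropy-regularized optimal transport via Sinkhorn iterations.
    # Returns a plan whose row sums approximate row_marginal and whose
    # column sums approximate col_marginal.
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Rescale rows, then columns, to match the target marginals.
        u = [row_marginal[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_marginal[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical inputs: three frames, two modalities (appearance, motion).
frame_attention = [0.5, 0.3, 0.2]      # assumed per-frame weights, sum to 1
modality_attention = [0.6, 0.4]        # assumed per-modality weights, sum to 1
cost = [[0.1, 0.9],                    # assumed frame-to-modality costs
        [0.8, 0.2],
        [0.5, 0.5]]

plan = sinkhorn(cost, frame_attention, modality_attention)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(2)]
```

Each entry of `plan` can be read as a joint frame-modality weight. Because its row and column sums are tied to the two separate attention vectors, the plan couples them, which is one plausible reading of how a learned structure could regularize the individual frame and modality attentions.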

Main Results:

When added to existing state-of-the-art temporal action localization methods, the proposed module consistently improves their performance.
New state-of-the-art performance is achieved on the THUMOS14 benchmark.
Experiments demonstrate the effectiveness of adaptive feature weighting for action localization.

Conclusions:

The structured attention composition module offers a plug-and-play solution for enhancing temporal action localization.
Adaptive weighting of appearance and motion features is key to improving model performance.
This approach advances the field by better exploiting multi-modality feature learning.