StochasticFormer: Stochastic Modeling for Weakly Supervised Temporal Action Localization.

Haichao Shi, Xiao-Yu Zhang, Changsheng Li

    IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society | April 7, 2023
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces StochasticFormer, a novel framework for weakly supervised temporal action localization (WS-TAL) that addresses under- and over-localization issues. By modeling finer-grained interactions, it achieves more accurate action identification in videos.

    Area of Science:

    • Computer Vision
    • Machine Learning
    • Artificial Intelligence

    Background:

    • Weakly supervised temporal action localization (WS-TAL) identifies the time intervals of actions in untrimmed videos using only video-level labels.
    • Existing WS-TAL methods struggle with under- and over-localization, degrading performance.
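To make the two failure modes concrete, here is a minimal sketch that scores a predicted interval against a ground-truth interval with temporal intersection-over-union (tIoU). It is only an illustration: the function name `temporal_iou` and the example intervals are assumptions, not taken from the paper. It assumes the common reading in which an under-localized prediction covers only part of the action, and an over-localized prediction spills into surrounding background.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) in seconds, with start < end


def temporal_iou(pred: Interval, gt: Interval) -> float:
    """Temporal IoU between a predicted and a ground-truth interval."""
    # Overlap is zero when the intervals do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth action spanning 10 s to 20 s.
gt = (10.0, 20.0)

# Under-localization: only a short part of the action is found.
under = temporal_iou((12.0, 15.0), gt)  # 3 / 10 = 0.3

# Over-localization: the prediction also swallows background frames.
over = temporal_iou((5.0, 30.0), gt)  # 10 / 25 = 0.4

print(f"under-localized tIoU: {under:.2f}")
print(f"over-localized tIoU:  {over:.2f}")
```

Both predictions overlap the true action, yet neither reaches a tIoU of 0.5, which is why either error can degrade localization performance.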

    Purpose of the Study:

    • To propose StochasticFormer, a transformer-based framework for refined temporal action localization.
    • To investigate finer-grained interactions among intermediate predictions for improved accuracy.

    Main Methods:

    • Developed a transformer-structured stochastic process modeling framework (StochasticFormer).
    • Utilized a pseudo localization module to generate variable-length pseudo action instances.
    • Employed an encoder-decoder network with deterministic and latent paths for information integration.
    • Optimized the framework using video-level classification, frame-level semantic coherence, and ELBO losses.

    Main Results:

    • StochasticFormer effectively addresses under- and over-localization challenges in WS-TAL.
    • Demonstrated superior performance compared to state-of-the-art methods on THUMOS14 and ActivityNet1.2 benchmarks.
    • Modeling finer-grained interactions among intermediate predictions further refined the localization results.

    Conclusions:

    • StochasticFormer offers a robust solution for weakly supervised temporal action localization.
    • The proposed framework significantly enhances localization accuracy in untrimmed videos.
    • The approach shows strong potential for advancing video understanding research.