Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization.

Linjiang Huang, Liang Wang, Hongsheng Li

    IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society | January 20, 2022
    Summary
    This summary is machine-generated.

    This study introduces a novel Multi-Modality Self-Distillation (MMSD) framework to improve weakly-supervised temporal action localization (WTAL) by effectively using RGB and optical flow data. The MMSD framework enhances action boundary detection and reduces label noise for better video understanding.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Weakly-supervised Temporal Action Localization (WTAL) is crucial for high-level video understanding but struggles to detect precise action boundaries, because training relies only on video-level classification labels.
    • Existing pseudo-label methods for WTAL often neither fully leverage multi-modal data (RGB and optical flow) nor effectively mitigate label noise, which limits feature representation learning.
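To make the boundary problem concrete, here is a minimal sketch of a common weakly-supervised setup; it is not taken from the paper. It assumes per-snippet scores for a single class, pools them into one video-level score with a top-k mean, and recovers segments by thresholding. The function names `video_score` and `localize`, the value of `k`, and the threshold are all illustrative assumptions.

```python
from typing import List, Tuple


def video_score(snippet_scores: List[float], k: int) -> float:
    """Pool per-snippet scores into one video-level score (top-k mean).

    Only this pooled score is compared against the video-level label during
    training, so no snippet ever receives direct boundary supervision.
    """
    top = sorted(snippet_scores, reverse=True)[: max(1, k)]
    return sum(top) / len(top)


def localize(snippet_scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive snippets scoring above `threshold` into segments.

    Returns (start, end) snippet indices with `end` exclusive.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(snippet_scores):
        if score > threshold and start is None:
            start = i  # a segment opens here
        elif score <= threshold and start is not None:
            segments.append((start, i))  # the segment closes before snippet i
            start = None
    if start is not None:
        segments.append((start, len(snippet_scores)))
    return segments


# Hypothetical scores for one action class over eight snippets.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]
print(video_score(scores, k=3))
print(localize(scores, threshold=0.5))  # [(2, 4), (6, 8)]
```

In this toy setup the segment boundaries depend entirely on the chosen threshold and on how well the snippet scores separate action from background, which is why noisy snippet-level pseudo-labels translate directly into imprecise boundaries.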

    Purpose of the Study:

    • To propose a novel Multi-Modality Self-Distillation (MMSD) framework to address the limitations of current WTAL methods.
    • To enhance the accuracy of action instance boundary detection in videos using multi-modal information.
    • To improve the robustness of WTAL models by mitigating label noise through a self-voting mechanism.

    Main Methods:

    • The proposed MMSD framework utilizes two single-modal streams (RGB and optical flow) and a fused-modal stream.
    • It incorporates multi-modality knowledge distillation to transfer information between streams, enhancing snippet-level classification.
    • Multi-modality self-voting is employed to reduce label noise by considering the reliability and complementarity of different modalities.
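The interplay of the three streams can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the summary does not specify the voting rule or the loss, so the unanimity rule in `self_vote`, the binary cross-entropy in `distill_loss`, the 0.5 threshold, and all names are assumptions chosen for illustration.

```python
import math
from typing import List, Optional


def self_vote(
    rgb: List[float], flow: List[float], fused: List[float], threshold: float = 0.5
) -> List[Optional[int]]:
    """Build snippet-level pseudo-labels from the three streams' scores.

    Assumed rule: a snippet is labeled 1 (action) or 0 (background) only when
    all streams agree; disagreements yield None and are treated as unreliable.
    """
    labels: List[Optional[int]] = []
    for scores in zip(rgb, flow, fused):
        votes = [int(s > threshold) for s in scores]
        if sum(votes) == len(votes):
            labels.append(1)
        elif sum(votes) == 0:
            labels.append(0)
        else:
            labels.append(None)  # streams disagree: skip this snippet
    return labels


def distill_loss(
    student: List[float], pseudo_labels: List[Optional[int]], eps: float = 1e-7
) -> float:
    """Mean binary cross-entropy of one stream against the voted pseudo-labels.

    Snippets whose pseudo-label is None do not contribute to the loss.
    """
    total, count = 0.0, 0
    for p, y in zip(student, pseudo_labels):
        if y is None:
            continue
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# Hypothetical per-snippet action probabilities from each stream.
rgb = [0.9, 0.2, 0.6, 0.1]
flow = [0.8, 0.1, 0.3, 0.2]
fused = [0.95, 0.15, 0.55, 0.05]

labels = self_vote(rgb, flow, fused)
print(labels)  # [1, 0, None, 0]
print(distill_loss(rgb, labels))
```

The third snippet is dropped because the optical flow stream disagrees with the other two, so it cannot inject a noisy label into training. A real system would likely weight votes by each modality's reliability rather than require strict unanimity.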

    Main Results:

    • The MMSD framework demonstrated significant improvements in weakly-supervised temporal action localization.
    • Experimental results on THUMOS14 and ActivityNet1.3 datasets show superior performance compared to existing state-of-the-art approaches.
    • The method effectively mitigates label noise and enhances discriminative feature representation learning.

    Conclusions:

    • The Multi-Modality Self-Distillation (MMSD) framework offers an effective solution for WTAL by leveraging multi-modal data and robust label noise reduction.
    • The proposed approach achieves state-of-the-art performance, highlighting the benefits of knowledge distillation and self-voting in multi-modal settings.
    • The code is publicly available, facilitating further research and development in temporal action localization.