Unsupervised Modality-Transferable Video Highlight Detection With Representation Activation Sequence Learning.

Tingtian Li, Zixun Sun, Xinyu Xiao

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

March 7, 2024

Summary

This summary is machine-generated.

This study introduces a new unsupervised highlight detection model using cross-modal perception. It effectively identifies key moments in videos without manual labeling, improving editing efficiency for online content.

Area of Science:

Computer Science
Artificial Intelligence
Machine Learning

Background:

Manual video labeling is time-consuming and limits supervised learning for new video categories.
Many videos lack audio, hindering multimodal approaches for highlight detection.
Efficient highlight detection is vital for video editing on internet platforms.

Purpose of the Study:

To develop a novel unsupervised model for highlight detection using cross-modal perception.
To address limitations of supervised methods and audio absence in highlight detection.
To improve the efficiency of video editing by automating highlight identification.

Main Methods:

A novel model with cross-modal perception for unsupervised highlight detection.
Learning visual-audio semantics via self-reconstruction with image-audio pairs.
Representation Activation Sequence Learning (RASL) module with k-point contrastive learning for significant activations.
Symmetric Contrastive Learning (SCL) module for paired visual-audio representation learning.
Auxiliary masked Feature Vector Sequence (FVS) reconstruction for representation enhancement.
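The summary names a Symmetric Contrastive Learning (SCL) module for paired visual-audio representations but does not give its loss. As a rough illustration only, the sketch below shows one common symmetric contrastive objective (an InfoNCE-style loss averaged over the visual-to-audio and audio-to-visual directions). The function names, the temperature value, and the choice of InfoNCE are assumptions made for this example, not details taken from the paper.

```python
import math

def l2_normalize(vec):
    # Scale a vector to unit length (left unchanged if it is all zeros).
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cross_entropy_diagonal(logits):
    # Mean cross-entropy where the correct class for row i is column i.
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(visual, audio, temperature=0.1):
    # visual[i] and audio[i] are assumed to come from the same segment
    # (hypothetical pairing convention for this sketch).
    v = [l2_normalize(x) for x in visual]
    a = [l2_normalize(x) for x in audio]
    # Cosine similarity between every visual/audio pair, scaled by temperature.
    logits = [[sum(p * q for p, q in zip(vi, aj)) / temperature for aj in a]
              for vi in v]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the visual-to-audio and audio-to-visual directions.
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(logits_t))

aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In this toy case the loss is lower when each visual vector matches its own audio vector than when the pairs are swapped, which is the behavior a paired-representation objective is meant to encourage.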

Main Results:

The proposed model achieves superior performance in unsupervised highlight detection compared to state-of-the-art methods.
At inference, the model generates representations carrying paired visual-audio semantics from the visual modality alone.
The RASL module outputs a highlight score for each video segment.
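How per-segment highlight scores are turned into a final selection is not described in this summary. One simple possibility, shown here purely as an assumption, is to keep the k highest-scoring segments and return them in temporal order. The function name `top_k_highlights` and the example scores are hypothetical.

```python
def top_k_highlights(scores, k):
    # Rank segment indices by score, highest first (ties keep temporal order).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Keep the k best segments and restore their order in the video.
    return sorted(ranked[:k])

# Hypothetical scores for five consecutive segments.
segment_scores = [0.1, 0.9, 0.3, 0.8, 0.2]
print(top_k_highlights(segment_scores, 2))  # -> [1, 3]
```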

Conclusions:

The unsupervised cross-modal framework advances highlight detection by removing the need for labeled training videos.
The model overcomes challenges posed by manual labeling and audio absence.
This approach enhances video editing efficiency by automating the identification of key moments.