Cross-Modal Contrastive Masked AutoEncoder for Compressed Video Pre-Training.

Bing Li, Jiaxin Chen, Guohao Li

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

July 15, 2025

Summary

This summary is machine-generated.

We introduce Cross-modal Contrastive Masked AutoEncoder (C2MAE), a novel approach for self-supervised learning (SSL) on compressed videos. C2MAE enhances representation learning by combining masked image modeling and contrastive learning for improved video understanding.

Area of Science:

Computer Vision
Machine Learning
Artificial Intelligence

Background:

Self-supervised learning (SSL) is crucial for leveraging large unlabeled video datasets.
Compressed videos present unique challenges for representation learning due to data sparsity and noise.
Existing methods often struggle to effectively utilize information from different modalities within compressed video.

Purpose of the Study:

To propose a novel Transformer-based approach, Cross-modal Contrastive Masked AutoEncoder (C2MAE), for self-supervised learning on compressed videos.
To enhance representation learning by integrating masked image modeling (MIM) and contrastive learning (CL) pretext tasks.
To improve the handling of compressed video characteristics like I-frame sparsity and P-frame noise.

Main Methods:

Employed a unified Transformer encoder to process visual tokens from RGB, motion vectors, and residuals.
Developed a hybrid SSL framework combining MIM and CL, extending VideoMAE with Fine-Grained Motion-aware Masking (FGMM) and Multi-modal Reconstruction (MR).
Introduced Contrastive Cross-modal Learning (CCL) by comparing features from compressed and raw video clips.
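To make the two pretext tasks listed above more concrete, here is a minimal sketch, not the authors' implementation. It assumes that motion-aware masking can be approximated by sampling tokens to mask with probability proportional to their motion-vector magnitude, and that the contrastive step can be approximated by a standard InfoNCE loss in which the compressed-clip and raw-clip features of the same video form the positive pair. The function names, the mask ratio of 0.75, and the temperature of 0.07 are all hypothetical choices made for illustration; the summary does not specify how FGMM or CCL are actually defined.

```python
import math
import random


def motion_aware_mask(motion_magnitudes, mask_ratio=0.75, seed=0):
    """Return sorted token indices to mask, favouring high-motion tokens.

    Hypothetical stand-in for motion-aware masking: weighted sampling
    without replacement via an exponential race (smallest key wins),
    with weights equal to motion magnitude plus a small floor.
    """
    rng = random.Random(seed)
    n_mask = int(round(mask_ratio * len(motion_magnitudes)))
    keys = []
    for index, magnitude in enumerate(motion_magnitudes):
        weight = magnitude + 1e-6  # floor so static tokens can still be picked
        keys.append((rng.expovariate(1.0) / weight, index))
    keys.sort()
    return sorted(index for _, index in keys[:n_mask])


def _normalize(vector):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def info_nce(compressed_feats, raw_feats, temperature=0.07):
    """Mean InfoNCE loss; row i of each input comes from the same clip.

    Assumed form of the cross-modal contrastive objective: cosine
    similarity logits, with the matching raw clip as the positive and
    all other raw clips in the batch as negatives.
    """
    a = [_normalize(v) for v in compressed_feats]
    b = [_normalize(v) for v in raw_feats]
    total = 0.0
    for i, a_i in enumerate(a):
        logits = [sum(x * y for x, y in zip(a_i, b_j)) / temperature for b_j in b]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


# Toy usage: 8 tokens with made-up motion magnitudes, 2 clips with 2-D features.
masked = motion_aware_mask([0.0, 0.1, 2.0, 3.0, 0.2, 1.5, 0.0, 4.0])
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the loss is near zero when compressed and raw features of the same clip agree and large when they are swapped, which is the behaviour a contrastive objective of this kind is meant to reward. A hybrid objective would add such a term to a masked-reconstruction loss; how the two are weighted is not stated in the summary.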

Main Results:

C2MAE significantly enhances cross-modal interactions, effectively compensating for the limitations of compressed video data.
Achieved state-of-the-art results on UCF-101, HMDB-51, and Kinetics-400 benchmarks.
Demonstrated the effectiveness of C2MAE in delivering stronger pre-trained models for video understanding tasks.

Conclusions:

C2MAE offers a powerful and effective framework for self-supervised learning on compressed videos.
The proposed FGMM strategy and CCL module contribute to superior representation learning.
The approach successfully addresses the challenges posed by compressed video data, leading to significant performance gains.