Updated: Jan 8, 2026

Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation.

Jinxing Zhou, Zhihui Li, Yongqiang Yu

IEEE Transactions on Pattern Analysis and Machine Intelligence

December 11, 2025

Summary

This summary is machine-generated.

Meta-Token Learning (Mettle) offers a memory-efficient way to adapt large audio-visual models. This method reduces training memory and time for researchers with limited computational resources, enabling wider accessibility.

Area of Science:

Artificial Intelligence
Computer Vision
Machine Learning

Background:

Current audio-visual learning research prioritizes task-specific models using complex multimodal fusion.
Recent advancements focus on universal audio-visual embedding networks for diverse downstream tasks.
Parameter-efficient fine-tuning of large pretrained transformers is common but memory-intensive due to deep backbones.
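
To make the memory point concrete, the rough estimate below counts the activation values that backpropagation has to keep when gradients flow through every layer of a deep backbone, even if only a few parameters are trainable. All sizes here (layer count, batch size, token count, feature width) are hypothetical placeholders and not figures from the paper, and the formula deliberately ignores attention maps and optimizer state. The second call only shows how the same arithmetic scales if a handful of compact tokens were stored instead of the full sequence.

```python
def activation_bytes(layers, batch, tokens, dim, saved_per_layer=1, bytes_per_value=4):
    """Rough bytes of activations kept for the backward pass.

    Assumes `saved_per_layer` tensors of shape (batch, tokens, dim) are
    stored per layer; real transformers store more than one.
    """
    return layers * saved_per_layer * batch * tokens * dim * bytes_per_value


# Hypothetical sizes, chosen only for illustration.
full = activation_bytes(layers=24, batch=8, tokens=1024, dim=1024)
compact = activation_bytes(layers=24, batch=8, tokens=16, dim=1024)

print(f"full sequence:  {full / 2**20:.1f} MiB")     # 768.0 MiB
print(f"compact tokens: {compact / 2**20:.1f} MiB")  # 12.0 MiB
```

The cost grows linearly with both depth and token count, which is why deep backbones dominate training memory regardless of how few parameters are updated.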

Purpose of the Study:

To introduce Meta-Token Learning (Mettle), a novel, memory-efficient method for adapting large pretrained transformer models to audio-visual tasks.
To address the high training memory consumption of existing parameter-efficient fine-tuning techniques.
To enhance accessibility for researchers with constrained computational resources.

Main Methods:

Mettle employs a lightweight Layer-Centric Distillation (LCD) module to distill intact audio/visual features from each transformer layer into compact meta-tokens.
The distillation process balances pretrained knowledge preservation with task-specific adaptation.
A Meta-Token Injection (MTI) module is introduced for fine-grained segmentation tasks, guiding earlier layer adaptation using top-layer distilled meta-tokens.
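
The summary does not describe the internals of the LCD module, so the following is only a minimal sketch of one way a layer's features could be condensed into a few tokens: a small set of query vectors attends over the layer's output and each query yields one pooled token. The attention-pooling mechanism, the function name `distill_to_meta_tokens`, and all sizes are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def distill_to_meta_tokens(meta_queries, layer_features):
    """Pool N layer features (N x D) into M meta-tokens (M x D).

    Hypothetical stand-in for a distillation step: each query attends
    over all features with scaled dot-product attention.
    """
    dim = len(layer_features[0])
    scale = 1.0 / math.sqrt(dim)
    meta_tokens = []
    for q in meta_queries:
        scores = [scale * sum(qi * fi for qi, fi in zip(q, f)) for f in layer_features]
        weights = softmax(scores)
        # Weighted average of the features: one compact token per query.
        token = [sum(w * f[d] for w, f in zip(weights, layer_features)) for d in range(dim)]
        meta_tokens.append(token)
    return meta_tokens


random.seed(0)
N, M, D = 196, 4, 8  # hypothetical: 196 features per layer, 4 meta-tokens, width 8
features = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
queries = [[random.gauss(0, 1) for _ in range(D)] for _ in range(M)]

meta = distill_to_meta_tokens(queries, features)
print(len(meta), len(meta[0]))  # 4 8
```

In a trainable version the queries would be learned parameters and the features would come from a pretrained layer; here both are random so the example stays self-contained.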

Main Results:

Mettle significantly reduces memory usage and training time compared to existing methods.
The approach maintains parameter efficiency.
Competitive accuracy is achieved across various audio-visual benchmarks, including event localization, video parsing, and segmentation tasks.

Conclusions:

Mettle provides a simple and memory-efficient solution for adapting large-scale pretrained transformer models for audio-visual tasks.
The method democratizes access to advanced audio-visual learning by lowering computational barriers.
Mettle demonstrates strong performance across diverse audio-visual benchmarks, highlighting its versatility and effectiveness.