

Learning bimodal structure in audio-visual data.

Gianluca Monaci¹, Pierre Vandergheynst, Friedrich T Sommer

¹Redwood Center for Theoretical Neuroscience, University of California, Berkeley, CA 94720-3190 USA. gianluca.monaci@philips.com

IEEE Transactions on Neural Networks

December 8, 2009

Summary

This summary is machine-generated.

This study introduces a new unsupervised learning model for audio-visual signals. The model effectively identifies sound sources in videos, even with significant background noise and visual distractions.


Area of Science:

Computer Vision
Machine Learning
Signal Processing

Background:

Audio-visual signals contain rich, complementary information.
Existing methods struggle to effectively model complex audio-visual structures.
Unsupervised learning offers a promising avenue for discovering inherent data patterns.

Purpose of the Study:

To develop a novel unsupervised model for learning bimodally informative structures from audio-visual signals.
To represent audio-visual signals as sparse sums of learned audio-visual kernels.
To demonstrate the model's capability in sound source localization.
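The "sparse sum of kernels" idea can be written compactly. The notation below is an assumption made for illustration and is not taken from the paper: a(t) is the audio track, v(x, y, t) the video, and each bimodal kernel pairs an audio part with a visual part. The sketch further assumes that both parts of a kernel share one temporal shift, while the visual part also shifts in space.

```latex
% Assumed notation (illustrative, not the paper's):
%   a(t)        audio signal
%   v(x,y,t)    video signal
%   \phi_k      k-th bimodal kernel, with audio part \phi_k^{(a)} and visual part \phi_k^{(v)}
%   c_{k,i}     coefficient of the i-th occurrence of kernel k
%   (p_{k,i}, q_{k,i}, \tau_{k,i})   its spatial and temporal position
\begin{pmatrix} a(t) \\ v(x,y,t) \end{pmatrix}
\;\approx\;
\sum_{k=1}^{K} \sum_{i=1}^{n_k} c_{k,i}
\begin{pmatrix}
  \phi_k^{(a)}(t - \tau_{k,i}) \\
  \phi_k^{(v)}(x - p_{k,i},\; y - q_{k,i},\; t - \tau_{k,i})
\end{pmatrix}
```

"Sparse" here means that the total number of terms, the sum of the n_k, is small relative to the size of the signal.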

Main Methods:

A sparse representation of audio-visual signals using bimodal kernels (audio waveform snippets and spatio-temporal visual basis functions).
Unsupervised learning to form dictionaries of these bimodal kernels from data.
Independent and arbitrary positioning of kernels in space and time for signal representation.
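The representation step described above can be sketched as a greedy matching pursuit over shifted copies of a single bimodal kernel. This is a toy illustration under stated assumptions, not the authors' implementation: the kernel is given rather than learned, audio and video are assumed to share one temporal grid (one audio sample per frame), and the names `normalize_kernel`, `bimodal_scores`, `matching_pursuit`, and `source_location` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def normalize_kernel(k_audio, k_video):
    """Scale the (audio, video) pair so the joint kernel has unit norm."""
    norm = np.sqrt(np.sum(k_audio ** 2) + np.sum(k_video ** 2))
    return k_audio / norm, k_video / norm


def bimodal_scores(audio, video, k_audio, k_video):
    """Inner product of the bimodal kernel with the signal at every shift (t, y, x).

    Assumption: the audio part shifts in time only, the visual part shifts in
    time and space, and both parts share the same temporal shift.
    """
    assert k_audio.shape[0] == k_video.shape[0], "toy setup: equal temporal length"
    assert audio.shape[0] == video.shape[0], "toy setup: shared temporal grid"
    # Audio correlation over time: shape (T - L + 1,)
    a_score = sliding_window_view(audio, k_audio.shape[0]) @ k_audio
    # Video correlation over time and space: shape (T - L + 1, H - h + 1, W - w + 1)
    v_win = sliding_window_view(video, k_video.shape)
    v_score = np.einsum("tyxijk,ijk->tyx", v_win, k_video)
    # The audio score does not depend on spatial position, so broadcast it.
    return a_score[:, None, None] + v_score


def matching_pursuit(audio, video, k_audio, k_video, n_atoms=1):
    """Greedily pick the best-matching shift, subtract it, and repeat.

    Expects a unit-norm kernel (see normalize_kernel), so that the inner
    product is the projection coefficient.
    """
    res_a = np.asarray(audio, dtype=float).copy()
    res_v = np.asarray(video, dtype=float).copy()
    L, h, w = k_video.shape
    atoms = []
    for _ in range(n_atoms):
        scores = bimodal_scores(res_a, res_v, k_audio, k_video)
        t, y, x = np.unravel_index(np.argmax(np.abs(scores)), scores.shape)
        c = scores[t, y, x]
        # Remove the selected atom from both modalities of the residual.
        res_a[t:t + L] -= c * k_audio
        res_v[t:t + L, y:y + h, x:x + w] -= c * k_video
        atoms.append((c, t, y, x))
    return atoms, res_a, res_v


def source_location(atom, k_video_shape):
    """Center (row, col) of the visual kernel footprint for one atom.

    Assumption: the spatial position of the strongest atom is used as a
    simple estimate of where the sound source appears in the frame.
    """
    _, _, y, x = atom
    _, h, w = k_video_shape
    return (y + (h - 1) / 2, x + (w - 1) / 2)
```

To try it, embed a scaled copy of a normalized kernel at a known position in otherwise empty `audio` and `video` arrays and call `matching_pursuit(audio, video, k_audio, k_video, n_atoms=1)`; the returned atom should recover the embedded coefficient and position. Learning the kernels themselves from data, which the study does without supervision, is outside the scope of this sketch.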

Main Results:

Learned dictionaries capture salient audio-visual data structures.
The model successfully localizes sound sources in video frames.
Speaker localization remains robust to acoustic and visual distracters in two-speaker scenarios.

Conclusions:

The proposed model effectively learns meaningful audio-visual structures.
The learned dictionary facilitates accurate sound source localization.
This approach demonstrates robustness in complex, real-world scenarios.