



Discriminative Cross-Modality Attention Network for Temporal Inconsistent Audio-Visual Event Localization.

Hanyu Xuan, Lei Luo, Zhenyu Zhang

    IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society
    |September 3, 2021
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces a novel network for audio-visual event localization, effectively handling temporal inconsistencies by adaptively filtering information. The approach enhances multi-modality perception for more accurate event identification.


    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Signal Processing

    Background:

    • Data from a single modality is insufficient to capture the full semantics of real-world scenes.
    • Audio-visual event localization requires matching audio and visual data for event identification.
    • Existing methods struggle with temporal inconsistencies in audio-visual scenes.

    Purpose of the Study:

    • To develop a method for audio-visual event localization that overcomes temporal inconsistencies.
    • To simulate human multi-modality perception for adaptive information filtering.
    • To improve the fusion of audio and visual signals for robust event localization.

    Main Methods:

    • Proposed a discriminative cross-modality attention network inspired by human perception.
    • Implemented adaptive attention mechanisms for 'where', 'when', and 'which' to attend.
    • Introduced a novel eigenvalue-based objective function for training and signal fusion.
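
As a rough illustration of the "where to attend" idea listed above, the sketch below lets an audio feature weight the spatial regions of a video segment. It is not the authors' network: the function name `audio_guided_attention`, the feature shapes, and the scaled dot-product scoring are all assumptions chosen for the example, and the random arrays stand in for real audio and visual features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def audio_guided_attention(visual, audio):
    """Hypothetical audio-guided spatial attention.

    visual: (T, R, D) array, R region features for each of T segments.
    audio:  (T, D) array, one audio feature per segment.
    Returns the attended visual feature (T, D) and the weights (T, R).
    """
    d = visual.shape[-1]
    # Similarity between each visual region and the segment's audio feature.
    scores = np.einsum("trd,td->tr", visual, audio) / np.sqrt(d)
    # Normalize over regions so each segment's weights sum to 1.
    weights = softmax(scores, axis=-1)
    # Weighted sum of region features: regions matching the audio dominate.
    attended = np.einsum("tr,trd->td", weights, visual)
    return attended, weights

# Placeholder features: 10 segments, 49 regions (e.g. a 7x7 grid), 128 dims.
rng = np.random.default_rng(0)
visual = rng.normal(size=(10, 49, 128))
audio = rng.normal(size=(10, 128))
attended, weights = audio_guided_attention(visual, audio)
```

The same pattern could in principle be applied along the time axis to decide "when" to attend, by normalizing scores over segments instead of regions; how the paper combines these steps is not specified in this summary.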

    Main Results:

    • The network adaptively selects event-relevant information, even with significant temporal inconsistencies.
    • Improved audio-visual signal fusion, yielding discriminative, nonlinear representations.
    • Systematically investigated temporal, weakly-supervised spatial, and cross-modality localization subtasks.

    Conclusions:

    • The proposed network effectively addresses temporal inconsistencies in audio-visual event localization.
    • The eigenvalue-based objective function enhances multi-modality representation and fusion.
    • The approach offers a more robust solution for complex audio-visual perception tasks.