Sound retrieval and ranking using sparse auditory representations.

Richard F Lyon¹, Martin Rehn, Samy Bengio

¹Google, Mountain View, CA 94043, USA. dicklyon@google.com

Neural Computation

June 24, 2010

Summary

This summary is machine-generated.

New auditory models outperform conventional MFCC-based front ends for ranking sounds. In a large-scale sound-ranking test, the best auditory model scored 18% better than the best-performing MFCC front end.

Area of Science:

Acoustics and Signal Processing
Machine Learning for Audio Analysis
Computational Auditory Scene Analysis

Background:

Effective sound representation is crucial for systems that must understand everyday human auditory environments.
Evaluating sound representations requires large-scale, quantitative frameworks.
Machine vision techniques can be adapted for audio processing tasks.

Purpose of the Study:

To quantitatively evaluate different sound representations in a large-scale sound-ranking task.
To compare novel auditory models against conventional Mel-frequency cepstral coefficients (MFCCs).
To adapt the passive-aggressive model for image retrieval (PAMIR) so that it ranks sounds instead of images.

Main Methods:

Utilized a sound-ranking framework adapted from the passive-aggressive model for image retrieval (PAMIR).
Tested auditory models that combine an adaptive pole-zero filter cascade (PZFC) auditory filter bank with sparse-code feature extraction from stabilized auditory images using multiple vector quantizers.
Compared these auditory models against conventional MFCC front ends.
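
The passive-aggressive ranking idea behind PAMIR can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a bilinear score q^T W x between a query-term vector q and a sound feature vector x, a hinge loss with margin 1 on a (query, relevant sound, irrelevant sound) triplet, and an aggressiveness cap C. The function names, the dense list-of-lists layout for W, and the default C are hypothetical choices made for readability.

```python
def score(W, q, x):
    """Bilinear relevance score q^T W x (W is a list of rows)."""
    return sum(
        qi * sum(wij * xj for wij, xj in zip(row, x))
        for qi, row in zip(q, W)
    )


def pa_rank_update(W, q, x_pos, x_neg, C=1.0):
    """One passive-aggressive step on a (query, relevant, irrelevant) triplet.

    Updates W in place and returns the hinge loss measured before the update.
    Assumption: margin of 1 and step size capped at C.
    """
    # Difference between the relevant and the irrelevant feature vectors.
    diff = [p - n for p, n in zip(x_pos, x_neg)]

    # Hinge loss: zero once the relevant sound outscores the other by >= 1.
    loss = max(0.0, 1.0 - score(W, q, diff))

    # Squared Frobenius norm of the outer product q * diff^T.
    norm_sq = sum(v * v for v in q) * sum(d * d for d in diff)
    if loss == 0.0 or norm_sq == 0.0:
        return loss  # passive: the ranking constraint already holds

    # Aggressive: smallest step that fixes the violation, capped at C.
    tau = min(C, loss / norm_sq)
    for i, qi in enumerate(q):
        if qi != 0.0:  # only rows for active query terms change
            for j, dj in enumerate(diff):
                W[i][j] += tau * qi * dj
    return loss
```

With sparse inputs, only the rows of W for active query terms and the columns for nonzero feature differences are touched, which is one reason this style of update suits very large sparse feature spaces.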

Main Results:

Auditory models demonstrated a significant advantage over vector-quantized MFCCs.
The best auditory model achieved 73% precision at top-1 and 35% average precision.
This represents an 18% improvement compared to the best-performing MFCC front end.
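
The two reported metrics can be computed as in the sketch below. It assumes the standard information-retrieval definitions of precision at top-k and average precision for a single query; the page does not state the exact formulas used in the study, and the function names and the boolean relevance list are illustrative.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_relevance[:k]
    return sum(1 for rel in top if rel) / k


def average_precision(ranked_relevance):
    """Mean of the precision values taken at each relevant item's rank."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Relevance of the sounds returned for one query, best-ranked first.
ranking = [True, False, True, False]
p_at_1 = precision_at_k(ranking, 1)
ap = average_precision(ranking)
```

Averaging these per-query values over all test queries gives summary figures of the kind reported above.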

Conclusions:

Advanced auditory models offer superior performance for large-scale sound recognition tasks.
Sparse-code feature extraction from auditory images provides a more discriminative representation than MFCCs.
The PAMIR framework is effective for evaluating and developing robust auditory feature representations.