
Updated: Jun 13, 2025


Comparing human and machine speech recognition in noise with QuickSIN.

Malcolm Slaney¹, Matthew B Fitzgerald²

¹Center for Computer Research in Music and Acoustics, Stanford University, Stanford, California 94305, USA.

JASA Express Letters

September 9, 2024

Summary

This summary is machine-generated.

A new test evaluates automatic speech recognition systems in noise. Modern systems perform comparably to human listeners, with results ranging from normal hearing to mild impairment in noisy conditions.


Area of Science:

Speech processing
Human-computer interaction
Auditory perception

Background:

Speech recognition systems are crucial for human-computer interaction.
Evaluating speech recognition in noise is essential for real-world applications.
Human performance in noise provides a benchmark for system capabilities.

Purpose of the Study:

To propose a novel test for characterizing automatic speech recognition (ASR) system performance in noise.
To benchmark modern ASR systems against human performance using the QuickSIN test.
To establish a standardized metric for evaluating speech-in-noise recognition abilities of ASR.

Main Methods:

Utilized the QuickSIN (Quick Speech in Noise) test, commonly used in audiology.
Measured the signal-to-noise ratio (SNR) at which ASR systems achieve 50% keyword recognition.
Compared ASR performance in noise to established human performance data.
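
The 50% keyword-recognition threshold described above can be estimated in more than one way; the following is a minimal Python sketch, not the authors' implementation. It assumes the standard QuickSIN list layout, which this summary does not state: six sentences presented from 25 dB down to 0 dB SNR in 5 dB steps, with five keywords each. The function names `snr50_spearman_karber` and `snr50_interpolated` are hypothetical, and the example scores in the usage note are invented for illustration.

```python
from typing import Sequence


def snr50_spearman_karber(
    keywords_correct: Sequence[int],
    start_snr_db: float = 25.0,       # assumed easiest SNR in the list
    step_db: float = 5.0,             # assumed SNR step between sentences
    keywords_per_sentence: int = 5,   # assumed keywords scored per sentence
) -> float:
    """Estimate the SNR (dB) at which 50% of keywords are recognized.

    Uses the Spearman-Karber formula:
        SNR-50 = start + step / 2 - step * total_correct / keywords_per_sentence
    keywords_correct holds one count per sentence, easiest SNR first.
    """
    total_correct = sum(keywords_correct)
    return start_snr_db + step_db / 2.0 - step_db * total_correct / keywords_per_sentence


def snr50_interpolated(
    snrs_db: Sequence[float], fraction_correct: Sequence[float]
) -> float:
    """Estimate SNR-50 by linear interpolation between the two tested
    SNRs whose keyword accuracy brackets 50%."""
    points = sorted(zip(snrs_db, fraction_correct))  # ascending SNR
    for (s0, p0), (s1, p1) in zip(points, points[1:]):
        if p0 < 0.5 <= p1:
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    raise ValueError("accuracy does not cross 50% within the tested SNR range")
```

As a usage example with made-up scores, an ASR system that recognizes 5, 5, 5, 4, 2, and 0 keywords at 25, 20, 15, 10, 5, and 0 dB gives `snr50_spearman_karber([5, 5, 5, 4, 2, 0])` equal to 6.5 dB. The interpolated estimate on the same data lands close to that value but need not match it exactly, since the two methods weight the data differently.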

Main Results:

Modern ASR systems, trained on extensive unsupervised data, were evaluated.
ASR performance in noise varied, with some systems performing at a normal human level.
Other systems demonstrated mild impairment in noisy conditions compared to human participants.

Conclusions:

The proposed test effectively characterizes ASR performance in challenging acoustic environments.
Modern ASR systems span the range from normal to mildly impaired human performance when recognizing speech in noise.
Grounding ASR performance metrics to human abilities is vital for developing robust speech technologies.