Recognition of speech spectrograms.

B G Greene, D B Pisoni, T D Carrell

The Journal of the Acoustical Society of America

July 1, 1984

Summary

This summary is machine-generated.

Naive observers can learn to identify speech spectrograms with high accuracy after about 20 hours of training. They generalize this visual speech recognition skill to different talkers and novel words, relying on visual phonetic features.

Area of Science:

Auditory Perception
Speech Processing
Visual Learning

Background:

Speech spectrograms offer a visual representation of spoken language.
Understanding how people learn to read these visual displays of speech is important for human-computer interaction and speech pathology.
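For readers unfamiliar with how a spectrogram is produced, here is a minimal sketch of a short-time Fourier transform in plain Python. It is not the procedure used in the study; the function names (`hann`, `spectrogram`), the frame and hop sizes, and the 1 kHz test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len//2) per time frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Taper the frame so its edges do not smear energy across bins.
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT for bin k (slow, but dependency-free).
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz (not study data).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]
spec = spectrogram(tone)

# The strongest bin in the first frame, converted back to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
```

Plotting `spec` with time on the horizontal axis, frequency on the vertical axis, and magnitude as darkness gives the kind of display the observers learned to read. Real speech would show bands of energy that shift over time rather than the single steady band this tone produces.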

Purpose of the Study:

To investigate the learning curve and generalization capabilities of naive observers in identifying speech spectrograms.
To determine the accuracy of spectrogram identification, and the factors influencing it, in observers with no prior phonetic or acoustic training.

Main Methods:

Eight naive observers were trained for approximately 20 hours to identify 50 phonetically balanced words from spectrograms.
Identification tests were conducted immediately after daily training sessions.
Generalization tests involved novel tokens from the same and different talkers (male, female, synthetic) and a new word set.
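Identification and generalization tests of this kind are typically scored as percent correct per test condition. The sketch below shows one way to tally such scores; the `percent_correct` helper, the condition labels, and the trial records are hypothetical and are not data or code from the study.

```python
from collections import defaultdict


def percent_correct(trials):
    """Map each condition to the percentage of trials answered correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for condition, target, response in trials:
        totals[condition] += 1
        # Compare case-insensitively and ignore surrounding whitespace.
        if response.strip().lower() == target.strip().lower():
            hits[condition] += 1
    return {c: 100.0 * hits[c] / totals[c] for c in totals}


# Made-up trials for illustration only; these are not data from the study.
trials = [
    ("original talker", "ship", "ship"),
    ("original talker", "deaf", "deaf"),
    ("new talker", "ship", "chip"),
    ("new talker", "deaf", "deaf"),
]
scores = percent_correct(trials)
```

Each record is a (condition, target word, response) triple, so adding a new test condition needs no change to the scoring function.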

Main Results:

Subjects achieved over 95% accuracy in identifying the trained words from a single talker after about 20 hours of training.
Generalization accuracy was 91% for novel tokens from the original talker, 76% for new male and female talkers, and 48% for a synthetic talker.
Subjects utilized salient visual correlates of phonetic features for identification, demonstrating abstraction of perceptual strategies.

Conclusions:

Naive observers can rapidly learn to identify speech spectrograms with high accuracy without prior phonetic or acoustic training.
The learned skill generalizes to different talkers and novel words, though accuracy drops for unfamiliar talkers and most sharply for synthetic speech.
Perceptual strategies involve abstracting visual features corresponding to phonetic elements.