Speech segregation based on sound localization.

Nicoleta Roman¹, DeLiang Wang, Guy J Brown

  • ¹Department of Computer and Information Science, The Ohio State University, Columbus, Ohio 43210, USA. niki@cis.ohio-state.edu

The Journal of the Acoustical Society of America, November 1, 2003
Summary
This summary is machine-generated.

This study introduces a supervised learning method for speech segregation that separates a target voice from interfering sounds using spatial cues. The approach estimates time-frequency binary masks that closely approximate ideal ones, improving automatic speech recognition performance and, under certain conditions, speech intelligibility.

Area of Science:

  • Auditory Neuroscience
  • Signal Processing
  • Machine Learning

Background:

  • Simulating human auditory perception, specifically selective listening in noisy environments, presents a significant challenge.
  • Existing methods for speech segregation struggle to replicate the brain's ability to focus on a single voice amidst interference.

Purpose of the Study:

  • To develop a novel supervised learning approach for speech segregation using spatial localization cues.
  • To investigate the effectiveness of ideal time-frequency binary masks in separating target speech from interfering sounds.
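
As a minimal sketch of the ideal binary mask idea, assuming the target and interference energies in each time-frequency unit are known in advance (which is what makes the mask "ideal"), the mask can be computed by a unit-by-unit comparison. The function name, the array layout (frequency channels by time frames), and the toy values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ideal_binary_mask(target_energy, interference_energy):
    """Return 1 where the target is stronger than the interference, else 0.

    Both inputs are arrays of per-unit energy with the same shape
    (assumed layout: frequency channels x time frames).
    """
    target_energy = np.asarray(target_energy, dtype=float)
    interference_energy = np.asarray(interference_energy, dtype=float)
    # Strict comparison: ties are assigned to the interference (mask = 0).
    return (target_energy > interference_energy).astype(int)

# Toy example: 2 frequency channels x 3 time frames (made-up values).
target = np.array([[4.0, 1.0, 0.5],
                   [0.2, 3.0, 2.0]])
interference = np.array([[1.0, 2.0, 0.5],
                         [0.1, 5.0, 1.0]])
mask = ideal_binary_mask(target, interference)
print(mask)
```

In practice the premixed signals are not available, which is why the study estimates the mask from spatial cues instead.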

Main Methods:

  • Utilizing interaural time differences (ITD) and interaural intensity differences (IID) as spatial localization cues.
  • Employing a pattern classification strategy in the binaural feature space to estimate ideal binary masks.
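  • Defining the ideal time-frequency (T-F) binary mask, motivated by the auditory masking effect, as the mask that selects the target wherever it is stronger than the interference in a local T-F unit.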
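
The two spatial cues named in the methods above can be illustrated with a rough sketch for a single signal frame. It assumes ITD is taken as the lag that maximizes the cross-correlation between the left-ear and right-ear signals and IID as the right-to-left energy ratio in decibels. The sign conventions, function names, and test signal are assumptions for illustration, and the sketch omits any auditory filterbank, so it should not be read as the authors' implementation.

```python
import numpy as np

def cross_correlation(left, right, lag):
    """Sum of left[n] * right[n + lag] over the overlapping samples."""
    n = len(left)
    if lag >= 0:
        return float(np.dot(left[:n - lag], right[lag:]))
    return float(np.dot(left[-lag:], right[:n + lag]))

def estimate_itd_iid(left, right, sample_rate, max_lag):
    """Estimate ITD (seconds) and IID (dB) for one frame of equal-length signals.

    Assumed conventions: positive ITD means the right-ear signal lags the
    left-ear signal; positive IID means the right-ear signal is stronger.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = [cross_correlation(left, right, int(lag)) for lag in lags]
    best_lag = int(lags[int(np.argmax(scores))])
    itd_seconds = best_lag / sample_rate
    eps = 1e-12  # guards against division by zero and log of zero
    iid_db = 10.0 * np.log10((np.sum(right ** 2) + eps) / (np.sum(left ** 2) + eps))
    return itd_seconds, iid_db

# Synthetic frame: the right ear hears the same pulse 3 samples later at half amplitude.
sample_rate = 16000
left = np.zeros(64)
left[20:30] = np.hanning(10)
right = 0.5 * np.roll(left, 3)
itd, iid = estimate_itd_iid(left, right, sample_rate, max_lag=8)
print(itd, iid)
```

Pairs of (ITD, IID) values computed this way per frequency band and frame would form the binaural feature space in which a classifier decides whether each unit belongs to the target.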

Main Results:

  • The proposed system generates binary masks that closely approximate ideal ones, as validated by signal-to-noise ratio (SNR) and automatic speech recognition (ASR) performance.
  • The model yields a significant performance improvement over an existing speech segregation approach.
  • Under certain conditions, it produces large speech intelligibility improvements for normal-hearing listeners.
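
To make the SNR evaluation concrete, here is a generic sketch of an output SNR in decibels, treating the difference between a reference signal and the segregated estimate as noise. This is a common textbook definition and is assumed here for illustration; the exact reference signal and metric used in the study may differ. The function name and toy signals are hypothetical.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB, with (reference - estimate) treated as the noise."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise_energy = np.sum((reference - estimate) ** 2)
    signal_energy = np.sum(reference ** 2)
    eps = 1e-12  # avoids division by zero for a perfect estimate
    return 10.0 * np.log10((signal_energy + eps) / (noise_energy + eps))

# Toy signals: the estimate is a uniformly attenuated copy of the reference.
reference = np.ones(4)
estimate = 0.9 * reference
value = snr_db(reference, estimate)
print(value)
```

A mask closer to the ideal one leaves less residual interference in the estimate, which shows up as a higher value of this measure.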

Conclusions:

  • The supervised learning model effectively separates target speech by using spatial cues to estimate ideal binary masks.
  • The approach offers a promising solution for enhancing speech intelligibility and ASR in complex acoustic environments.
  • This method advances the simulation of human auditory selective attention capabilities.