



Long short-term memory for speaker generalization in supervised speech separation.

Jitong Chen¹, DeLiang Wang¹

  • ¹Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, USA.

The Journal of the Acoustical Society of America | July 7, 2017
Summary
This summary is machine-generated.

This study introduces a Long Short-Term Memory (LSTM) model for speech separation, significantly improving performance on unseen speakers and noises compared to deep neural networks (DNNs). The LSTM model enhances speech intelligibility and is efficient for low-latency applications.


Area of Science:

  • Signal Processing
  • Machine Learning
  • Acoustics

Background:

  • Speech separation is crucial for enhancing audio quality in noisy environments.
  • Supervised speech separation models often generalize poorly to noises and speakers not seen during training.
  • Deep Neural Networks (DNNs) show promise but have limitations in modeling diverse speaker characteristics.

Purpose of the Study:

  • To develop a speech separation model that improves generalization to unseen speakers and noises.
  • To leverage the temporal modeling capabilities of Long Short-Term Memory (LSTM) networks for enhanced speaker generalization.
  • To evaluate the proposed LSTM-based model against DNN-based approaches for speech intelligibility and efficiency.

Main Methods:

  • Formulating speech separation as estimating a time-frequency mask from acoustic features.
  • Developing a novel speech separation model utilizing Long Short-Term Memory (LSTM) architecture.
  • Conducting systematic evaluations comparing the LSTM model with a DNN-based model on objective speech intelligibility metrics.
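
The mask formulation in the first bullet can be made concrete with a small sketch. The summary does not say which time-frequency mask the study uses, so the ideal ratio mask below is an assumption chosen for illustration, and the function names `ideal_ratio_mask` and `apply_mask` and the toy energies are hypothetical. The mask is the kind of target a supervised model would be trained to predict from acoustic features of the noisy mixture.

```python
import math

def ideal_ratio_mask(speech_energy, noise_energy, eps=1e-12):
    """Return sqrt(S / (S + N)) for every time-frequency unit.

    Inputs are lists of frames, each frame a list of per-bin energies.
    Values fall in [0, 1]: near 1 where speech dominates, near 0 where noise does.
    """
    return [
        [math.sqrt(s / (s + n + eps)) for s, n in zip(s_frame, n_frame)]
        for s_frame, n_frame in zip(speech_energy, noise_energy)
    ]

def apply_mask(mask, mixture_magnitude):
    """Scale each mixture magnitude by its mask value to estimate the speech magnitude."""
    return [
        [m * x for m, x in zip(m_frame, x_frame)]
        for m_frame, x_frame in zip(mask, mixture_magnitude)
    ]

# Toy data (assumed): 2 frames x 3 frequency bins of speech and noise energy.
speech_energy = [[4.0, 0.0, 1.0], [9.0, 1.0, 0.0]]
noise_energy = [[0.0, 4.0, 1.0], [0.0, 3.0, 2.0]]

# Assume speech and noise energies add in the mixture.
mixture_magnitude = [
    [math.sqrt(s + n) for s, n in zip(s_frame, n_frame)]
    for s_frame, n_frame in zip(speech_energy, noise_energy)
]

mask = ideal_ratio_mask(speech_energy, noise_energy)
estimate = apply_mask(mask, mixture_magnitude)
```

Under the additive-energy assumption, applying this mask to the mixture magnitude recovers the speech magnitude in each unit. At test time the clean speech and noise are unavailable, so the model's predicted mask takes the place of the ideal one.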

Main Results:

  • The proposed LSTM model significantly outperforms the DNN-based model on unseen speakers and noises.
  • The LSTM's internal representations capture long-term speech context.
  • The LSTM model shows advantages for low-latency speech separation, even without future frame information.
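
To illustrate how a recurrent model can estimate a mask without future frames, here is a minimal sketch of a single-layer LSTM processing frames one at a time. It is not the study's architecture: the layer sizes, the random untrained weights, the sigmoid output layer, and all names (`init_params`, `lstm_step`, `estimate_masks`) are assumptions made for illustration. The point is only that the hidden and cell states carry past context forward, so each mask frame depends on current and earlier input alone.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(n_in, n_hidden, n_out, seed=0):
    """Random, untrained weights for the gates i, f, o, g and the output layer."""
    rng = random.Random(seed)
    gates = {name: rand_matrix(rng, n_hidden, n_in + n_hidden) for name in "ifog"}
    biases = {name: [0.0] * n_hidden for name in "ifog"}
    w_out = rand_matrix(rng, n_out, n_hidden)
    return gates, biases, w_out

def affine(rows, bias, vec):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(rows, bias)]

def lstm_step(x, h, c, gates, biases):
    """One LSTM time step: update cell state c and hidden state h from input x."""
    z = x + h  # concatenate current input with previous hidden state
    i = [sigmoid(a) for a in affine(gates["i"], biases["i"], z)]    # input gate
    f = [sigmoid(a) for a in affine(gates["f"], biases["f"], z)]    # forget gate
    o = [sigmoid(a) for a in affine(gates["o"], biases["o"], z)]    # output gate
    g = [math.tanh(a) for a in affine(gates["g"], biases["g"], z)]  # candidate
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new

def estimate_masks(frames, params):
    """Emit one mask frame per input frame, using only past and current frames."""
    gates, biases, w_out = params
    n_hidden = len(biases["i"])
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    masks = []
    for x in frames:
        h, c = lstm_step(x, h, c, gates, biases)
        logits = affine(w_out, [0.0] * len(w_out), h)
        masks.append([sigmoid(a) for a in logits])  # mask values in (0, 1)
    return masks

# Toy feature frames (assumed): 4 frames x 3 features, mapped to 3 mask bins.
frames = [[0.2, 0.9, 0.1], [0.8, 0.3, 0.5], [0.4, 0.4, 0.7], [0.1, 0.6, 0.2]]
params = init_params(n_in=3, n_hidden=4, n_out=3)
masks = estimate_masks(frames, params)
```

Because the loop never looks ahead, changing a later frame leaves all earlier mask frames unchanged, which is the property a low-latency system relies on.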

Conclusions:

  • The LSTM-based model offers an effective solution for speaker- and noise-independent speech separation.
  • LSTM networks provide superior generalization capabilities compared to traditional DNNs for this task.
  • The proposed approach is promising for real-time and robust speech enhancement applications.