Speech Enhancement Using Gaussian Scale Mixture Models.

Jiucang Hao¹, Te-Won Lee, Terrence J Sejnowski

  • ¹ Computational Neurobiology Laboratory, Salk Institute, La Jolla, CA 92037 USA, and also with the Institute for Neural Computation, University of California, San Diego, CA 92093 USA.

IEEE Transactions on Audio, Speech, and Language Processing | March 2, 2011
Summary
This summary is machine-generated.

This study introduces a new probabilistic method for speech enhancement, modeling speech signals using Gaussian scale mixture models (GSMM). The approach improves signal-to-noise ratio and reduces word error rates, effectively suppressing speech-shaped noise.

Area of Science:

  • Signal Processing
  • Machine Learning
  • Acoustics

Background:

  • Traditional speech enhancement often relies on deterministic models.
  • Accurate modeling of speech signal characteristics is crucial for effective noise reduction.

Purpose of the Study:

  • To propose a novel probabilistic approach for speech enhancement.
  • To model speech signals in the log-spectral domain using Gaussian mixture models (GMM) and Gaussian scale mixture models (GSMM).
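To make the idea of a Gaussian scale mixture concrete, here is a minimal sketch in Python. It assumes a toy construction that is not taken from the paper: a log-variance is drawn from a small Gaussian mixture, and a zero-mean Gaussian coefficient is then drawn with that variance. The function names `sample_gsm` and `kurtosis`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_gsm(n, log_var_means, log_var_stds, weights, seed=0):
    """Draw n samples from a toy Gaussian scale mixture (illustrative only)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Pick which mixture component generates the log-variance.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw the log-variance from that component.
        log_var = rng.gauss(log_var_means[k], log_var_stds[k])
        # Draw a zero-mean Gaussian whose standard deviation is exp(log_var / 2).
        samples.append(rng.gauss(0.0, math.exp(0.5 * log_var)))
    return samples


def kurtosis(xs):
    """Return the (non-excess) kurtosis; a single Gaussian has a value of 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2)


if __name__ == "__main__":
    xs = sample_gsm(20000, [-2.0, 2.0], [0.5, 0.5], [0.5, 0.5])
    print("kurtosis:", kurtosis(xs))
```

Because the variance itself is random, the samples have heavier tails than a single Gaussian, which shows up as a kurtosis above 3. This is one common motivation for scale-mixture models of speech coefficients; the study's actual model may differ in its details.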

Main Methods:

  • Developed a probabilistic relationship between frequency coefficients and log-spectra.
  • Employed Expectation-Maximization (EM) for GSMM training and Bayesian inference for posterior distribution computation.
  • Implemented the Laplace method and a variational approximation for computational efficiency.
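The Expectation-Maximization step listed above can be illustrated on a much simpler model. The following is a minimal sketch, assuming a one-dimensional Gaussian mixture rather than the study's GSMM; the function names `gaussian_pdf` and `em_gmm_1d`, the synthetic data, and the initial values are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50):
    """Fit a 1-D Gaussian mixture with plain EM (illustrative, not the paper's GSMM)."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return means, variances, weights


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(-2.0, 1.0) for _ in range(500)]
    data += [rng.gauss(3.0, 1.0) for _ in range(500)]
    print(em_gmm_1d(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]))
```

The E-step here is an exact posterior over component labels. In a model where the posterior has no closed form, approximations such as the Laplace method or a variational bound take its place, which is the role the summary attributes to them.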

Main Results:

  • Reconstructed signals from estimated frequency coefficients yielded a higher signal-to-noise ratio (SNR).
  • Reconstructed signals from estimated log-spectra resulted in lower word recognition error rates.
  • Successfully reduced speech-shaped noise (SSN), outperforming spectral analysis methods.
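For readers unfamiliar with the SNR metric mentioned above, the following minimal sketch shows one common way to compute it in decibels from a clean reference signal and an estimate. The function name `snr_db` and the sample values are hypothetical; the study's exact evaluation procedure is not described in this summary.

```python
import math


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB, treating (clean - estimate) as the noise."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if noise_power == 0.0:
        return math.inf  # a perfect reconstruction has no residual noise
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    enhanced = [1.1, -0.9, 1.1, -0.9]
    print(snr_db(clean, enhanced))  # approximately 20 dB
```

A higher value means the reconstructed signal is closer to the clean reference. Word recognition error rate, the other metric in the results, is measured separately by running a speech recognizer on the enhanced audio.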

Conclusions:

  • The proposed probabilistic GSMM approach offers significant improvements in speech enhancement.
  • The developed approximations provide efficient and effective methods for noise reduction.
  • This method is particularly effective against speech-shaped noise.