Array configuration-agnostic personalized speech enhancement using long-short-term spatial coherence.

Yicheng Hsu¹, Yonghan Lee¹, Mingsian R Bai¹

¹Department of Power Mechanical Engineering, National Tsing Hua University, No. 101, Section 2, Kuang-Fu Road, Hsinchu 30044, Taiwan.

The Journal of the Acoustical Society of America

October 20, 2023

Summary

This summary is machine-generated.

This study introduces a multichannel personalized speech enhancement (PSE) system built on a new spatial feature, long-short-term spatial coherence (LSTSC). The system achieves superior enhancement performance without requiring precise knowledge of the microphone array configuration or room responses.

Area of Science:

Signal Processing
Acoustics
Machine Learning

Background:

Personalized speech enhancement (PSE) aims to suppress interfering speech, such as competing speakers, while preserving the target speaker's voice.
Multichannel PSE systems offer advantages over single-channel methods in noisy environments.
Implementing multichannel PSE for diverse household microphone arrays is challenging.

Purpose of the Study:

To develop an array configuration-agnostic multichannel personalized speech enhancement (PSE) system.
To introduce a novel spatial feature for monitoring target speaker activity.
To evaluate the system's performance in adverse acoustic conditions.

Main Methods:

A novel spatial feature, long-short-term spatial coherence (LSTSC) with a dynamic forgetting factor, was defined.
A convolutional recurrent network was utilized with the LSTSC feature as input.
A simplified LSTSC feature was proposed to reduce computational cost.
Experiments compared the proposed PSE systems against baselines using unseen room responses and array configurations.
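
The summary does not give the formula for LSTSC, so the following Python sketch is not the authors' definition. It only illustrates the general idea the name suggests: compare a short-term and a long-term recursive average of a per-frame spatial vector and report how well they agree. All names (`recursive_average`, `coherence`, `long_short_coherence`) and the two fixed forgetting factors are assumptions made for illustration; in particular, the dynamic forgetting factor mentioned above is replaced here by constants.

```python
import numpy as np

def recursive_average(frames, alpha):
    """Exponentially weighted running average; alpha is the forgetting factor."""
    avg = np.zeros_like(frames[0])
    out = []
    for x in frames:
        avg = alpha * avg + (1.0 - alpha) * x
        out.append(avg.copy())
    return np.array(out)

def coherence(a, b, eps=1e-12):
    """Magnitude of the normalized inner product of two complex vectors, in [0, 1]."""
    return np.abs(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

def long_short_coherence(frames, alpha_short=0.5, alpha_long=0.98):
    """One scalar per frame: agreement between short- and long-term averages.

    frames has shape (T, M): T time frames of an M-channel spatial vector
    for a single frequency bin (hypothetical input layout).
    """
    short = recursive_average(frames, alpha_short)
    long_ = recursive_average(frames, alpha_long)
    return np.array([coherence(s, l) for s, l in zip(short, long_)])

# Toy input: the dominant spatial vector switches after 50 frames.
v1 = np.array([1.0, 0.0], dtype=complex)
v2 = np.array([0.0, 1.0], dtype=complex)
frames = np.array([v1] * 50 + [v2] * 50)
feature = long_short_coherence(frames)
```

In this toy case the feature stays near 1 while the spatial vector is stable and drops once it changes. The output has one value per frame regardless of the channel count M, which is the kind of property an array configuration-agnostic input feature would need.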

Main Results:

The proposed multichannel PSE network trained with the LSTSC feature achieved better speech enhancement than the baseline systems.
The system demonstrated effectiveness without requiring precise knowledge of array configurations or room responses.
Both complete and simplified LSTSC versions showed significant performance improvements.

Conclusions:

The developed LSTSC feature enables array configuration-agnostic multichannel PSE.
The proposed system offers a robust solution for speech enhancement in complex acoustic environments.
This approach advances the practical application of multichannel PSE in real-world scenarios.