Noise Perturbation for Supervised Speech Separation.

Jitong Chen¹, Yuxuan Wang¹, DeLiang Wang²

  • ¹Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210.

Speech Communication | February 23, 2016
Summary
This summary is machine-generated.

Training classifiers on perturbed noise improves supervised speech separation. Of the perturbations tested, frequency perturbation was the most effective, reducing the misclassification of noise as speech at low signal-to-noise ratios.

Keywords:
Speech separation, noise perturbation, supervised learning

Area of Science:

  • Signal Processing
  • Machine Learning
  • Acoustics

Background:

  • Speech separation, a core task in audio processing, is often framed as mask estimation.
  • Supervised methods must generalize well from limited training data.
  • Nonstationary noise can cause classifiers to misidentify noise patterns as speech.
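To make "mask estimation" concrete, here is a minimal sketch of one common training target, the ideal binary mask. The summary does not say which mask the study used, so the mask type, the function name `ideal_binary_mask`, the threshold `lc_db`, and the toy energies are all assumptions made for illustration.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 (speech-dominant) or 0 (noise-dominant).

    speech_energy[t][f] and noise_energy[t][f] are premixed energies.
    lc_db is a hypothetical local SNR threshold in dB.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            # Local SNR in dB; the small floor avoids log(0) and division by zero.
            local_snr = 10.0 * math.log10((s + 1e-12) / (n + 1e-12))
            row.append(1 if local_snr > lc_db else 0)
        mask.append(row)
    return mask

# Toy 2-frame, 2-bin example (made-up numbers).
speech = [[4.0, 0.1], [0.5, 2.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)
```

A supervised separator would then be trained to predict such labels from features of the mixture alone, which is where misclassifying noise-dominant units as speech becomes a problem.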

Purpose of the Study:

  • To investigate the impact of noise perturbations on supervised speech separation performance.
  • To evaluate three specific noise perturbations: rate, vocal tract length, and frequency.
  • To determine the optimal perturbation strategy for low signal-to-noise ratios (SNRs).
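As a rough illustration of the simplest of the three perturbations, noise rate perturbation can be pictured as playing a noise segment faster or slower. The sketch below resamples a waveform by linear interpolation; the function name `rate_perturb`, the interpolation method, and the example rates are assumptions, not the procedure used in the study.

```python
def rate_perturb(signal, rate):
    """Resample a waveform so it plays `rate` times faster (rate > 1) or slower (rate < 1).

    Uses plain linear interpolation between neighbouring samples.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_out = int(len(signal) / rate)
    out = []
    for i in range(n_out):
        pos = i * rate                       # fractional read position in the input
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)    # clamp at the last sample
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out

# Toy noise segment (made-up samples).
noise = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
faster = rate_perturb(noise, 2.0)   # half as many samples
slower = rate_perturb(noise, 0.5)   # twice as many samples
```

Each perturbed copy can be mixed with speech as an additional training example, which is one way to stretch a limited set of noise recordings.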

Main Methods:

  • Trained a classifier on speech-noise mixtures in which the noise had been perturbed.
  • Applied noise rate, vocal tract length, and frequency perturbations.
  • Evaluated separation performance using classification accuracy, hit-minus-false-alarm rate, and short-time objective intelligibility (STOI).
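Two of the metrics listed above can be computed directly from an estimated mask and a reference mask. The sketch below assumes binary masks stored as nested lists; the function name `mask_scores` and the toy masks are illustrative only. STOI is omitted because it requires the resynthesized waveform and a reference signal.

```python
def mask_scores(estimated, ideal):
    """Return (accuracy, hit_rate, false_alarm_rate, hit_minus_fa) for binary masks.

    hit_rate: fraction of speech-dominant units (ideal == 1) labelled 1.
    false_alarm_rate: fraction of noise-dominant units (ideal == 0) labelled 1.
    """
    hits = false_alarms = n_speech = n_noise = correct = 0
    for e_row, i_row in zip(estimated, ideal):
        for e, i in zip(e_row, i_row):
            correct += int(e == i)
            if i == 1:
                n_speech += 1
                hits += int(e == 1)
            else:
                n_noise += 1
                false_alarms += int(e == 1)
    total = n_speech + n_noise
    accuracy = correct / total if total else 0.0
    hit_rate = hits / n_speech if n_speech else 0.0
    fa_rate = false_alarms / n_noise if n_noise else 0.0
    return accuracy, hit_rate, fa_rate, hit_rate - fa_rate

# Toy masks (made-up labels): 2 frames x 4 frequency bins.
ideal = [[1, 1, 0, 0], [1, 0, 0, 0]]
estimated = [[1, 0, 1, 0], [1, 0, 0, 0]]
accuracy, hit_rate, fa_rate, hit_fa = mask_scores(estimated, ideal)
```

The false-alarm term is the one that penalizes labelling noise-dominant units as speech, so HIT minus FA is more sensitive to that error than plain accuracy when most units are noise-dominant.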

Main Results:

  • Frequency perturbation outperformed noise rate and vocal tract length perturbations.
  • Frequency perturbation significantly reduced the misclassification of noise patterns as speech.
  • All evaluated metrics showed improvement with frequency perturbation at low SNRs.
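The summary does not describe how frequency perturbation is implemented. One plausible reading, sketched below purely as an assumption, is to shift each time-frequency unit of the noise spectrogram by a small, locally smoothed random amount along the frequency axis. The function name `frequency_perturb` and the parameters `strength`, `smooth`, and `seed` are hypothetical.

```python
import random

def frequency_perturb(spec, strength=2.0, smooth=1, seed=0):
    """Randomly warp a spectrogram along the frequency axis.

    spec[t][f] is the magnitude at frame t and frequency bin f.
    strength scales the maximum shift in bins; smooth sets the half-width
    of the neighbourhood used to average the random shifts.
    """
    rng = random.Random(seed)
    n_t, n_f = len(spec), len(spec[0])
    raw = [[rng.uniform(-1.0, 1.0) for _ in range(n_f)] for _ in range(n_t)]
    out = []
    for t in range(n_t):
        row = []
        for f in range(n_f):
            # Average the random shifts over neighbouring units so that the
            # warp varies smoothly across time and frequency.
            vals = [raw[tt][ff]
                    for tt in range(max(0, t - smooth), min(n_t, t + smooth + 1))
                    for ff in range(max(0, f - smooth), min(n_f, f + smooth + 1))]
            shift = strength * sum(vals) / len(vals)
            # Read the input at the shifted bin, clamped to the valid range,
            # using linear interpolation between the two nearest bins.
            pos = min(max(f + shift, 0.0), n_f - 1.0)
            lo = int(pos)
            hi = min(lo + 1, n_f - 1)
            frac = pos - lo
            row.append((1.0 - frac) * spec[t][lo] + frac * spec[t][hi])
        out.append(row)
    return out

# Toy noise spectrogram (made-up magnitudes): 3 frames x 5 bins.
noise_spec = [[0.0, 1.0, 4.0, 1.0, 0.0],
              [0.0, 2.0, 3.0, 2.0, 0.0],
              [1.0, 1.0, 5.0, 1.0, 1.0]]
perturbed = frequency_perturb(noise_spec, strength=1.5, seed=42)
unchanged = frequency_perturb(noise_spec, strength=0.0)
```

Changing `seed` yields a different perturbed copy of the same noise, so a single recording can supply many distinct training patterns.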

Conclusions:

  • Noise perturbation, particularly frequency perturbation, enhances supervised speech separation.
  • Frequency perturbation is an effective technique for improving classifier robustness against nonstationary noise.
  • This method offers a promising approach for better speech separation in challenging acoustic environments.