Sensitivity of Acoustic Voice Quality Measures in Simulated Reverberation Conditions
Summary
This summary is machine-generated. Room reverberation significantly affects acoustic voice analysis. Jitter, harmonic-to-noise ratio (HNR), and alpha ratio remain stable at reverberation times below 1 s, shimmer is sensitive to reverberation, and smoothed cepstral peak prominence (CPPs) remains stable across conditions.
Area Of Science
- Acoustics
- Speech Science
- Signal Processing
Background
- Room reverberation distorts voice recordings, affecting the accuracy of voice quality and vocal health assessments.
- Computer analysis of voice is particularly sensitive to reverberation, necessitating an understanding of its impact on acoustic parameters.
Purpose Of The Study
- To quantify the effect of simulated room reverberation on common voice quality metrics.
- To assess the sensitivity and stability of jitter, shimmer, harmonic-to-noise ratio (HNR), alpha ratio, and smoothed cepstral peak prominence (CPPs) under varying reverberation conditions.
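As a reminder of what two of these metrics measure, the sketch below computes local jitter and local shimmer using the commonly cited definition: the mean absolute difference between consecutive cycles divided by the mean value. It is a minimal illustration, not the study's analysis pipeline. The function names and the sample values are hypothetical, and it assumes that cycle periods and peak amplitudes have already been extracted from the recording.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variation in period (seconds), as a fraction."""
    return local_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation in peak amplitude, as a fraction."""
    return local_perturbation(peak_amplitudes)


# Hypothetical cycle data for a voice near 200 Hz (illustrative values only).
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [1.0, 1.1, 0.9, 1.0]
print(f"jitter:  {100 * local_jitter(periods):.2f} %")
print(f"shimmer: {100 * local_shimmer(amplitudes):.2f} %")
```

Because shimmer depends on cycle peak amplitudes, anything that alters the waveform envelope, such as overlapping reflections, feeds directly into it.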
Main Methods
- Sustained [a:] vowel recordings from five healthy female speakers, produced in comfortable and clear vocal styles, were used.
- Eight levels of simulated reverberation (T20 from 0.004 to 1.82 s) were applied to clean recordings using Audacity.
- Voice samples were analyzed using PRAAT software to calculate jitter, shimmer, HNR, alpha ratio, and CPPs.
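T20 is a reverberation time obtained by fitting the sound decay between -5 dB and -25 dB and extrapolating the slope to a 60 dB decay. The sketch below estimates T20 from an impulse response using Schroeder backward integration. It is an illustration under stated assumptions, not the procedure used in the study and not Audacity's reverb algorithm: the impulse response is a synthetic, ideal exponential decay, and all function names and parameter values are hypothetical.

```python
import math


def synthetic_impulse_response(rt60_s, sample_rate, duration_s=None):
    """Ideal decay envelope whose energy falls by 60 dB in rt60_s seconds."""
    if duration_s is None:
        duration_s = 1.5 * rt60_s
    n = int(duration_s * sample_rate)
    # Amplitude drops by 60 dB (a factor of 10**-3) after rt60_s seconds.
    return [10.0 ** (-3.0 * i / (sample_rate * rt60_s)) for i in range(n)]


def schroeder_decay_db(impulse_response):
    """Energy decay curve in dB: backward-integrated squared response."""
    energy = 0.0
    curve = []
    for sample in reversed(impulse_response):
        energy += sample * sample
        curve.append(energy)
    curve.reverse()
    total = curve[0]
    if total <= 0.0:
        raise ValueError("impulse response has no energy")
    return [10.0 * math.log10(e / total) for e in curve if e > 0.0]


def estimate_t20(impulse_response, sample_rate):
    """Least-squares fit of the decay from -5 dB to -25 dB, scaled to 60 dB."""
    decay_db = schroeder_decay_db(impulse_response)
    points = [(i / sample_rate, level)
              for i, level in enumerate(decay_db)
              if -25.0 <= level <= -5.0]
    if len(points) < 2:
        raise ValueError("decay range too short to fit")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(level for _, level in points) / n
    num = sum((t - mean_t) * (level - mean_l) for t, level in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    slope_db_per_s = num / den
    return -60.0 / slope_db_per_s


# Hypothetical example: a decay designed to have a 0.5 s reverberation time.
ir = synthetic_impulse_response(rt60_s=0.5, sample_rate=8000)
print(f"estimated T20: {estimate_t20(ir, 8000):.3f} s")
```

A measured room response would also contain a noise floor, which is one reason the fit stops at -25 dB rather than following the decay all the way down.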
Main Results
- Jitter, HNR, and alpha ratio demonstrated stability at reverberation times (T20) below 1 second.
- HNR and jitter showed greater stability in the clear vocal style compared to the comfortable style.
- Shimmer was highly sensitive to reverberation, with effects reported at a T20 of 0.53 s, while CPPs remained stable across all tested conditions.
Conclusions
- Voice metrics exhibit differential sensitivity to room reverberation.
- CPPs are a reliable metric even in reverberant conditions, suitable for less controlled environments.
- Shimmer should be interpreted cautiously in reverberant settings, while jitter, HNR, and alpha ratio can be used when the reverberation time (T20) is below 1 s.

