Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection.

Jianwei Zhang (1), Julie Liss (2), Suren Jayasuriya (3)

  • (1) School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85281, USA.

IEEE/ACM Transactions on Audio, Speech, and Language Processing | October 30, 2023
Summary
This summary is machine-generated.

This study introduces a novel deep learning framework for accurate automatic detection of voice impairments (dysphonia). The method generates robust acoustic embeddings, improving performance across different datasets and conditions.

Keywords:
Dysphonic voice, contrastive loss, embedding learning

Area of Science:

  • Speech processing
  • Biomedical engineering
  • Machine learning

Background:

  • Approximately 1.2% of the global population experiences impaired voice production, necessitating reliable automated assessment tools.
  • Current automated voice analysis methods often lack generalizability across different datasets and applications.
  • There is a significant need for robust and accurate methods for dysphonic voice detection.

Purpose of the Study:

  • To develop a deep learning framework for generating acoustic feature embeddings that are sensitive to vocal quality.
  • To enhance the robustness of voice analysis models across diverse corpora and conditions.
  • To improve the accuracy and generalizability of automatic dysphonic voice detection.

Main Methods:

  • A deep learning model was trained using a combination of contrastive and classification loss functions.
  • Data warping techniques were applied to input voice samples to increase model robustness.
  • The framework was designed to generate acoustic feature embeddings sensitive to voice quality.
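
The combination of contrastive and classification losses described above can be illustrated with a minimal sketch. The summary does not give the paper's exact formulation, so everything here is an assumption made for illustration: the pairwise margin-based contrastive term, the softmax cross-entropy term, the weighting factor `weight`, the `margin` value, and all function names. One plausible use, also an assumption, is to pair a sample's embedding with the embedding of a warped copy of the same sample.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Pairwise contrastive term (an assumed, generic formulation).

    Same-class pairs are pulled together (squared distance);
    different-class pairs are pushed apart until they are at least
    `margin` away from each other.
    """
    d = euclidean(emb_a, emb_b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(emb_a, emb_b, logits_a, label_a, label_b,
                  weight=0.5, margin=1.0):
    """Classification loss plus a weighted contrastive loss.

    `weight` is a hypothetical hyperparameter; the source does not
    state how the two terms are balanced.
    """
    ce = cross_entropy(logits_a, label_a)
    con = contrastive_loss(emb_a, emb_b, label_a == label_b, margin)
    return ce + weight * con
```

In this sketch the classification term drives the dysphonic versus healthy decision, while the contrastive term shapes the embedding space so that samples of the same class stay close even when the input has been altered.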

Main Results:

  • The proposed method achieved high classification accuracy both within and across different corpora.
  • The generated embeddings demonstrated sensitivity to voice quality and robustness across varied datasets.
  • The model consistently outperformed three baseline methods on clean and deteriorated voice datasets.
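
The distinction between within-corpus and cross-corpus accuracy can be made concrete with a small evaluation loop. This is only a sketch under stated assumptions: the corpus names, the one-dimensional toy feature, and the threshold classifier `train_threshold` are hypothetical stand-ins and are not the model or data used in the study.

```python
def accuracy(predict, samples):
    """Fraction of (feature, label) pairs that `predict` labels correctly."""
    correct = sum(1 for feature, label in samples if predict(feature) == label)
    return correct / len(samples)


def train_threshold(samples):
    """Toy stand-in classifier: threshold halfway between the class means."""
    pos = [f for f, y in samples if y == 1]
    neg = [f for f, y in samples if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda feature: 1 if feature > threshold else 0


def evaluate_across_corpora(train_fn, corpora):
    """Train on each corpus and test on every corpus.

    Entries with matching names are within-corpus accuracies;
    the remaining entries are cross-corpus accuracies.
    """
    results = {}
    for train_name, train_split in corpora.items():
        predict = train_fn(train_split["train"])
        for test_name, test_split in corpora.items():
            results[(train_name, test_name)] = accuracy(predict, test_split["test"])
    return results


# Hypothetical toy corpora: label 1 = dysphonic, label 0 = healthy.
toy_corpora = {
    "corpus_A": {
        "train": [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)],
        "test": [(0.3, 0), (0.7, 1)],
    },
    "corpus_B": {
        "train": [(0.4, 0), (0.5, 0), (1.1, 1), (1.2, 1)],
        "test": [(0.6, 0), (1.0, 1)],
    },
}
results = evaluate_across_corpora(train_threshold, toy_corpora)
```

With these toy numbers the classifier is perfect within each corpus but drops to 50% when tested on the other one, which is the kind of generalization gap that robust embeddings are meant to reduce.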

Conclusions:

  • The developed deep learning framework offers a robust and accurate solution for automatic dysphonic voice detection.
  • The method's ability to generalize across corpora makes it suitable for diverse clinical and research applications.
  • The generated acoustic embeddings hold potential for further advancements in voice quality assessment.