Plain-to-clear speech video conversion for enhanced intelligibility.

Shubam Sachdeva¹, Haoyao Ruan¹, Ghassan Hamarneh²

  • ¹Language and Brain Lab, Department of Linguistics, Simon Fraser University, Burnaby, BC, Canada.

International Journal of Speech Technology | April 3, 2023
Summary
This summary is machine-generated.

Researchers enhanced visual speech cues in videos to improve speech intelligibility. Modifying plain speech videos with clear speech features boosted AI lip-reading accuracy and shows potential for human training.

Keywords:
AI lip reading; Intelligibility; Speech enhancement; Speech style; Video speech synthesis

Area of Science:

  • Speech processing and computer vision
  • Human-computer interaction
  • Audiology and speech-language pathology

Background:

  • Clearly articulated speech demonstrably improves intelligibility compared to plain speech.
  • Visual speech cues in video-only formats are crucial for speech perception, especially in noisy environments.
  • Systematic modification of visual speech features to enhance intelligibility remains an underexplored area.

Purpose of the Study:

  • To investigate the systematic modification of visual speech cues to enhance clear-speech features.
  • To improve speech intelligibility using synthesized clear-speech videos derived from plain speech.
  • To evaluate the effectiveness of these synthesized videos using both AI lip-reading and human intelligibility tests.

Main Methods:

  • Extraction of clear-speech visual features from videos of English words with varying vowels.
  • Application of extracted features to plain speech videos using an image-warping technique with a 'displacement factor'.
  • Synthesis of novel clear-speech videos and evaluation via state-of-the-art AI lip readers and human participants.
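
The role of a 'displacement factor' in the second step above can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that the clear-speech features are available as per-landmark (dx, dy) offsets and that the factor scales those offsets linearly before the image is warped. The function name `scale_landmarks` and the sample coordinates are hypothetical, and the pixel-level image warping itself is not shown.

```python
def scale_landmarks(plain, offsets, factor):
    """Return target landmark positions for a warp.

    plain   -- list of (x, y) landmarks from a plain-speech frame (hypothetical data)
    offsets -- list of (dx, dy) clear-speech offsets, one per landmark (assumed form)
    factor  -- displacement factor; 0 keeps the plain shape, 1 applies full offsets
    """
    if len(plain) != len(offsets):
        raise ValueError("plain and offsets must have the same number of landmarks")
    # Move each landmark along its offset, scaled by the displacement factor.
    return [
        (x + factor * dx, y + factor * dy)
        for (x, y), (dx, dy) in zip(plain, offsets)
    ]


# Hypothetical mouth landmarks: left corner, right corner, upper lip, lower lip.
plain = [(10.0, 20.0), (30.0, 20.0), (20.0, 15.0), (20.0, 25.0)]
# Hypothetical offsets that widen and open the mouth.
offsets = [(-2.0, 0.0), (2.0, 0.0), (0.0, -4.0), (0.0, 4.0)]

targets = scale_landmarks(plain, offsets, 0.5)
```

Under these assumptions, the resulting target positions would be passed to an image-warping routine. Varying `factor` gives a graded transition between the plain and clear styles, which is one way to read the "quantifiable scaling" described in the summary.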

Main Results:

  • Successfully extracted and applied visual cues to enhance speech intelligibility for AI lip readers.
  • Demonstrated that universal, talker-independent clear-speech features can modify visual speech styles.
  • Introduced the 'displacement factor' for quantifiable scaling of visual modifications between speech styles.

Conclusions:

  • Systematic enhancement of visual speech features can significantly improve AI-based speech intelligibility.
  • The findings suggest the potential for talker-independent visual speech modification techniques.
  • Generated high-definition videos are suitable for future human-centric intelligibility and perceptual training studies.