



Spectral Flux-Based Convolutional Neural Network Architecture for Speech Source Localization and Its Real-Time Implementation

Yiya Hao¹, Abdullah Küçük¹, Anshuman Ganguly¹

  • ¹Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX 75080, USA.

IEEE Access: Practical Innovations, Open Solutions | May 13, 2021
Summary
This summary is machine-generated.

This study introduces a real-time convolutional neural network (CNN) algorithm for speech source localization (SSL) that remains robust in noisy environments. The method reaches 89.68% accuracy at 5 dB SNR under babble noise with a latency of 21 ms per frame, which makes it a candidate for audio processing on smart devices.

Keywords:
  • Speech source localization (SSL)
  • Beamforming (BF)
  • Convolutional neural networks (CNN)
  • Direction of arrival (DOA)
  • Hearing improvement (HI)
  • Real-time implementation


Area of Science:

  • Acoustics and Signal Processing
  • Artificial Intelligence and Machine Learning

Background:

  • Speech source localization (SSL) is crucial for audio processing but challenging in realistic noisy and reverberant conditions.
  • Existing SSL algorithms often struggle with performance degradation in complex acoustic environments.

Purpose of the Study:

  • To develop a real-time, robust CNN-based SSL algorithm capable of handling realistic background acoustic conditions.
  • To evaluate the algorithm's performance on a prototype platform for practical applications.

Main Methods:

  • Utilized a convolutional neural network (CNN) trained with features derived from the real and imaginary coefficients of the short-time Fourier transform (STFT) and from Spectral Flux (SF).
  • Employed delay-and-sum (DAS) beamforming as part of the input feature extraction process.
  • Trained the CNN model using diverse noisy speech recordings and tested on unseen acoustic environments.
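
The feature pipeline described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the frame length of 256 samples, the hop of 128 samples, the Hann window, and the use of integer sample delays with circular shifting are all assumptions made for the example.

```python
import numpy as np

def delay_and_sum(mics, delays):
    """Delay-and-sum beamformer (simplified, hypothetical helper).

    mics:   array of shape (channels, samples)
    delays: integer delay of each channel in samples
    Each channel is shifted back by its delay and the channels are averaged.
    np.roll wraps around at the edges, which is a simplification.
    """
    aligned = [np.roll(ch, -d) for ch, d in zip(mics, delays)]
    return np.mean(aligned, axis=0)

def stft(x, frame_len=256, hop=128):
    """Short-time Fourier transform with a Hann window.

    Returns a complex array of shape (frames, frame_len // 2 + 1).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def real_imag_features(spec):
    """Stack the real and imaginary STFT parts as two feature channels."""
    return np.stack([spec.real, spec.imag], axis=0)

# Two-microphone toy example: the second channel lags the first by 3 samples.
fs = 16000
source = np.sin(2 * np.pi * 440 * np.arange(2048) / fs)
mics = np.stack([source, np.roll(source, 3)])

beamformed = delay_and_sum(mics, delays=[0, 3])
features = real_imag_features(stft(beamformed))  # shape: (2, frames, bins)
```

In a localization setting the delays are not known in advance; one common approach is to steer the beamformer over a set of candidate delays, but how the paper combines DAS output with the CNN input is not specified in this summary.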

Main Results:

  • The proposed CNN-SSL algorithm demonstrated significant improvements over five previously published methods under various noisy conditions.
  • Achieved high accuracy (89.68% at 5 dB SNR under babble noise) with low latency (21 ms per frame).
  • Successfully implemented and tested for real-time operation on a Raspberry Pi prototype.

Conclusions:

  • The integration of Spectral Flux (SF) with beamforming enhances the CNN's ability to learn temporal variations in speech spectra, leading to improved SSL performance.
  • The developed algorithm offers a robust and efficient solution for real-time SSL, suitable for portable, battery-operated devices.
  • This work has significant implications for enhancing audio processing in smart loudspeakers and hearing improvement devices.
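
To make the role of Spectral Flux concrete, the sketch below computes one common form of it: the Euclidean norm of the frame-to-frame change in the magnitude spectrum. The exact definition and normalization used in the paper are not given in this summary, so the formula and the function name `spectral_flux` are assumptions.

```python
import numpy as np

def spectral_flux(mag):
    """Per-frame spectral flux of a magnitude spectrogram.

    mag: array of shape (frames, bins) holding |STFT| values.
    Returns an array of shape (frames,); the first frame has no
    predecessor, so its flux is set to 0 (an assumed convention).
    """
    diff = np.diff(mag, axis=0)                # change between adjacent frames
    flux = np.sqrt(np.sum(diff ** 2, axis=1))  # L2 norm over frequency bins
    return np.concatenate([[0.0], flux])

# A spectrum that is steady for two frames and then changes abruptly.
mag = np.array([[1.0, 1.0],
                [1.0, 1.0],
                [4.0, 5.0]])
flux = spectral_flux(mag)
```

Frames where the spectrum is stationary give a flux near zero, while onsets and other rapid spectral changes typical of speech give large values, which is the temporal cue the conclusions refer to.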