Spectral Flux-Based Convolutional Neural Network Architecture for Speech Source Localization and Its Real-Time

Yiya Hao¹, Abdullah Küçük¹, Anshuman Ganguly¹

¹Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX 75080, USA.

IEEE Access : Practical Innovations, Open Solutions

May 13, 2021

Summary

This summary is machine-generated.

This study introduces a real-time convolutional neural network (CNN) algorithm for robust speech source localization (SSL) in noisy environments. The method reaches 89.68% accuracy at 5 dB SNR under babble noise with a latency of 21 ms per frame, which makes it a candidate for audio processing in smart loudspeakers and hearing improvement devices.

Keywords:

Speech source localization (SSL), beamforming (BF), convolutional neural networks (CNN), direction of arrival (DOA), hearing improvement (HI), real-time implementation

Area of Science:

Acoustics and Signal Processing
Artificial Intelligence and Machine Learning

Background:

Speech source localization (SSL) is crucial for audio processing but challenging in realistic noisy and reverberant conditions.
Existing SSL algorithms often struggle with performance degradation in complex acoustic environments.

Purpose of the Study:

To develop a real-time, robust CNN-based SSL algorithm capable of handling realistic background acoustic conditions.
To evaluate the algorithm's performance on a prototype platform for practical applications.

Main Methods:

Utilized a convolutional neural network (CNN) trained with features derived from the real and imaginary coefficients of the short-time Fourier transform (STFT) and from Spectral Flux (SF).
Employed delay-and-sum (DAS) beamforming as part of the input feature extraction process.
Trained the CNN model using diverse noisy speech recordings and tested on unseen acoustic environments.
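
As a rough illustration of the feature step listed above, the sketch below computes STFT real and imaginary coefficients and one common definition of spectral flux (the Euclidean distance between successive magnitude spectra) with NumPy. The function names, frame length, hop size, and Hann window are assumptions made for illustration; this summary does not state the authors' exact parameters or their SF definition.

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Windowed STFT of a 1-D signal; returns (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def real_imag_features(spec):
    """Stack real and imaginary STFT parts as two channels: (frames, bins, 2)."""
    return np.stack([spec.real, spec.imag], axis=-1)

def spectral_flux(spec):
    """L2 distance between consecutive magnitude spectra (one common definition).

    The first frame has no predecessor, so its flux is set to 0.
    """
    mag = np.abs(spec)
    diff = np.diff(mag, axis=0)
    return np.concatenate([[0.0], np.sqrt(np.sum(diff ** 2, axis=1))])

if __name__ == "__main__":
    fs = 16000                                   # assumed sampling rate in Hz
    tone = np.sin(2 * np.pi * 1000 * np.arange(1024) / fs)
    x = np.concatenate([np.zeros(1024), tone])   # silence followed by a tone
    spec = stft_frames(x)
    print(real_imag_features(spec).shape)
    print(int(np.argmax(spectral_flux(spec))))   # frame index of the onset
```

Spectral flux stays near zero while the spectrum is stationary and rises when it changes, which is why it is often used to capture temporal variation such as speech onsets.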

Main Results:

The proposed CNN-SSL algorithm demonstrated significant improvements over five previously published methods under various noisy conditions.
Achieved 89.68% accuracy at 5 dB SNR under babble noise, with a latency of 21 ms per frame.
Successfully implemented and tested for real-time operation on a Raspberry Pi prototype.

Conclusions:

The integration of Spectral Flux (SF) with beamforming enhances the CNN's ability to learn temporal variations in speech spectra, leading to improved SSL performance.
The developed algorithm offers a robust and efficient solution for real-time SSL, suitable for portable, battery-operated devices.
This work has significant implications for enhancing audio processing in smart loudspeakers and hearing improvement devices.
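
For readers unfamiliar with the delay-and-sum (DAS) beamforming referred to above, here is a minimal sketch of a two-microphone DAS beamformer scanned over candidate angles. It is not the authors' pipeline: the function name `das_power`, the far-field model, the angle convention (measured from the array axis, with microphone 1 receiving later than microphone 0), and the 343 m/s speed of sound are all assumptions chosen for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def das_power(x, fs, mic_distance, angles_deg):
    """Output power of a two-microphone delay-and-sum beamformer per look angle.

    x            : array of shape (2, n_samples), one row per microphone
    fs           : sampling rate in Hz
    mic_distance : microphone spacing in metres
    angles_deg   : candidate look directions in degrees
    """
    n = x.shape[1]
    spectra = np.fft.rfft(x, axis=1)                 # (2, n_bins)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)           # (n_bins,)
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Far-field delay of microphone 1 relative to microphone 0, per angle.
    tau = mic_distance * np.cos(angles) / SPEED_OF_SOUND
    # Phase shift that undoes that delay on microphone 1 before summing.
    steer = np.exp(2j * np.pi * freqs[None, :] * tau[:, None])
    y = 0.5 * (spectra[0][None, :] + spectra[1][None, :] * steer)
    return np.sum(np.abs(y) ** 2, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(4096)
    # Simulate microphone 1 as a copy delayed by 2 samples (circular shift).
    x = np.stack([s, np.roll(s, 2)])
    angles = np.arange(0, 181)
    power = das_power(x, fs=16000, mic_distance=0.1, angles_deg=angles)
    print(int(angles[np.argmax(power)]))             # angle with maximum power
```

The output power peaks at the look direction whose steering delay best aligns the two channels, so the same scan can serve either as a simple direction estimate or as a source of direction-dependent input features.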