High-Level CNN and Machine Learning Methods for Speaker Recognition.

Giovanni Costantini¹, Valerio Cesarini¹, Emanuele Brenna¹

  • ¹Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Roma, Italy.

Sensors (Basel, Switzerland) | April 13, 2023
Summary
This summary is machine-generated.

A custom CNN reached 90.15% accuracy for speaker recognition on the DEMoS dataset, ahead of AlexNet (89.28%) and a Naïve Bayes classifier (87.09%). The comparison was run on recordings in which the speakers' emotional state varies.

Keywords:
AlexNet, CNN, F0, Machine Learning, Naïve Bayes, audio, speaker recognition

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Speech Processing

Background:

  • Speaker Recognition (SR) is a core task in AI-based sound analysis and is approached with both Deep Learning (DL) and traditional Machine Learning (ML) methods.
  • Evaluating these methods on datasets that include emotional variation is essential for judging their real-world applicability.

Purpose of the Study:

  • To compare the performance of a custom Convolutional Neural Network (CNN) against pre-trained networks and a Naïve Bayes model for speaker recognition.
  • To investigate the effectiveness of different input representations (spectrograms, MFCCs) and feature selection techniques.
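
As an illustration of the spectrogram input representation, here is a minimal sketch that turns a waveform into a grayscale image. The summary does not say how the spectrograms were computed, so everything here is an assumption: the short-time DFT with a Hann window, the hypothetical `frame_len` and `hop` values, and the min-max scaling to 0..255. The naive DFT keeps the example dependency-free; a real pipeline would likely use an FFT library.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of log-magnitudes per frequency bin."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT: O(N^2) per frame, acceptable for a demonstration.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(math.log(abs(acc) + 1e-10))
        frames.append(row)
    return frames


def to_grayscale(frames):
    """Min-max scale log-magnitudes to integer pixel values in 0..255."""
    flat = [v for row in frames for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[round(255 * (v - lo) / span) for v in row] for row in frames]


# Synthetic tone with 8 cycles per 64-sample frame (assumed test input).
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
image = to_grayscale(log_spectrogram(tone))
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

Each row of `image` is one time frame and each column one frequency bin, so the 2-D array can be fed to a CNN as a single-channel image.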

Main Methods:

  • A custom CNN and pre-trained networks (e.g., AlexNet) were trained using spectrograms and Mel-frequency cepstral coefficients (MFCCs).
  • A traditional ML approach employed acoustic feature extraction, selection, and a Naïve Bayes classifier.
  • Experiments were conducted on the DEMoS dataset, comprising 8869 audio files from 58 speakers with varying emotional states.
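
The traditional ML branch can be sketched roughly as follows. The summary names only "a Naïve Bayes classifier", so the Gaussian variant used here is an assumption, as are the function names `fit_gaussian_nb` and `predict` and the toy two-feature data (mean F0 and one MFCC value per utterance). Feature extraction and selection are not shown.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(features, labels):
    """Estimate a log-prior and per-feature mean/variance for each speaker."""
    grouped = defaultdict(list)
    for row, label in zip(features, labels):
        grouped[label].append(row)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small additive floor keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(features)), means, variances)
    return model


def predict(model, row):
    """Return the speaker with the highest log-posterior for one feature row."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=lambda label: log_posterior(model[label]))


# Toy data (hypothetical): [mean F0 in Hz, one MFCC value] per utterance.
X = [[120.0, -3.1], [118.0, -2.9], [125.0, -3.3],
     [210.0, 1.2], [205.0, 0.9], [215.0, 1.4]]
y = ["spk_a", "spk_a", "spk_a", "spk_b", "spk_b", "spk_b"]
model = fit_gaussian_nb(X, y)
guess = predict(model, [200.0, 1.0])
```

Because the fitted model is just a prior plus a mean and variance per feature and speaker, it trains in one pass over the data and its parameters can be inspected directly, which is consistent with the speed and interpretability noted for this approach.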

Main Results:

  • The custom CNN achieved the highest accuracy (90.15% on grayscale spectrograms).
  • AlexNet demonstrated comparable performance (89.28% on spectrograms).
  • The Naïve Bayes classifier offered 87.09% accuracy and high AUC (0.985) with faster training and better interpretability. Key features identified include F0, MFCC, and voicing-related parameters.
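
For readers unfamiliar with the two metrics reported above, the sketch below computes accuracy and a multi-class AUC on toy scores. The summary does not state how AUC was aggregated over the 58 speakers; the unweighted one-vs-rest average used here is an assumption, and the labels and scores are invented for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(is_positive, scores):
    """AUC as the chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true, score_rows):
    """Unweighted mean of one-vs-rest AUCs over all classes."""
    classes = sorted(set(y_true))
    per_class = [
        binary_auc([t == c for t in y_true], [row[c] for row in score_rows])
        for c in classes
    ]
    return sum(per_class) / len(per_class)


# Toy example: three hypothetical speakers, two utterances each.
y_true = ["a", "a", "b", "b", "c", "c"]
score_rows = [
    {"a": 0.70, "b": 0.20, "c": 0.10},
    {"a": 0.30, "b": 0.65, "c": 0.05},  # misclassified as "b"
    {"a": 0.20, "b": 0.60, "c": 0.20},
    {"a": 0.10, "b": 0.80, "c": 0.10},
    {"a": 0.10, "b": 0.20, "c": 0.70},
    {"a": 0.30, "b": 0.10, "c": 0.60},
]
y_pred = [max(row, key=row.get) for row in score_rows]
acc = accuracy(y_true, y_pred)
auc = macro_auc(y_true, score_rows)
```

Accuracy counts only the top-ranked speaker, while AUC rewards a model for ranking the true speaker above the others even when the top guess is wrong, which is why a classifier can have a lower accuracy and still a high AUC.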

Conclusions:

  • Custom CNNs, particularly when trained on grayscale spectrograms, achieved the highest accuracy of the compared methods for speaker recognition on emotionally varied speech.
  • Traditional ML methods like Naïve Bayes provide a viable, interpretable, and computationally efficient alternative.
  • The DEMoS dataset's characteristics, including emotional content and sample size, facilitate robust model generalization for speaker recognition.