A multi-dilated convolution network for speech emotion recognition.

Samaneh Madanian1, Olayinka Adeleye2, John Michael Templeton3

  • 1Department of Data Science and Artificial Intelligence, Auckland University of Technology, Auckland, New Zealand. sam.madanian@aut.ac.nz.

Scientific Reports | March 11, 2025
Summary
This summary is machine-generated.

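This study introduces a speech emotion recognition (SER) model that applies deep learning to utterance-level spectrograms. The model strengthens feature extraction with Spatial Pyramid Pooling (SPP), an attention mechanism, and an ArcFace layer, and reports unweighted accuracies of 67.9% on IEMOCAP and 77.6% on EMODB.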

Keywords:
Convolution neural network, Deep learning, Emotion recognition, Loss layer, Spectrogram, Speech emotion recognition

Area of Science:

  • Artificial Intelligence
  • Affective Computing
  • Speech Processing

Background:

  • Speech emotion recognition (SER) is crucial for human-computer interaction.
  • Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) show promise for SER using speech spectrograms.
  • Existing methods face challenges in effectively learning deep patterns from spectrograms.

Purpose of the Study:

  • To propose a novel SER model leveraging utterance-level spectrogram analysis.
  • To enhance feature extraction by integrating Spatial Pyramid Pooling (SPP) and attention mechanisms.
  • To adapt techniques from face recognition, specifically ArcFace, for improved SER performance.
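
For readers unfamiliar with ArcFace, the following is a minimal sketch of its additive angular margin, not the authors' implementation. The function names, the toy embedding and class weights, and the `scale` and `margin` values are illustrative assumptions; the summary does not state the settings used in the study.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Return scaled cosine logits with an angular margin on the target class.

    scale and margin are illustrative values, not taken from the study.
    """
    unit_embedding = l2_normalize(embedding)
    logits = []
    for class_index, weight in enumerate(class_weights):
        unit_weight = l2_normalize(weight)
        cos_theta = sum(a * b for a, b in zip(unit_embedding, unit_weight))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if class_index == target:
            # Adding the margin to the angle makes the target class harder to
            # satisfy, which pushes same-class embeddings closer together.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


# Toy example: a 2-D embedding and two hypothetical emotion classes.
print(arcface_logits([2.0, 0.0], [[1.0, 0.0], [0.0, 3.0]], target=0))
```

The resulting logits would normally be passed to a softmax cross-entropy loss during training; at inference time the margin is dropped and only the cosine similarities are compared.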

Main Methods:

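  • Utilized Spatial Pyramid Pooling (SPP) to overcome the fixed input-size constraint of CNNs, so that spectrograms of different lengths map to feature vectors of the same size.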
  • Extracted global and multi-local level feature vectors using SPP.
  • Implemented an attention model to weigh extracted feature vectors.
  • Applied the ArcFace layer, typically used in face recognition, to the SER task.
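
The pooling and weighting steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's architecture: the pyramid levels `(1, 2, 4)`, the dot-product scoring against a hypothetical learned `query` vector, and both function names are invented for the example.

```python
import math


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def spp_bins(channels, levels=(1, 2, 4)):
    """Max-pool each pyramid bin into one vector (one value per channel).

    channels is a list of 2-D feature maps of equal size. Level 1 gives a
    global vector; finer levels give local vectors. The number of vectors
    depends only on the levels, not on the map size.
    """
    rows, cols = len(channels[0]), len(channels[0][0])
    vectors = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * rows) // n, ceil_div((i + 1) * rows, n)
            for j in range(n):
                c0, c1 = (j * cols) // n, ceil_div((j + 1) * cols, n)
                vectors.append([
                    max(ch[r][c] for r in range(r0, r1) for c in range(c0, c1))
                    for ch in channels
                ])
    return vectors


def attention_pool(vectors, query):
    """Softmax-weight the vectors by their dot product with a query vector."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights


# Two toy "utterances" with different time lengths and two channels each.
short_map = [[[float(r + c) for c in range(5)] for r in range(3)] for _ in range(2)]
long_map = [[[float(r * c) for c in range(9)] for r in range(3)] for _ in range(2)]
for feature_map in (short_map, long_map):
    bins = spp_bins(feature_map)
    pooled, weights = attention_pool(bins, query=[0.1, -0.2])
    print(len(bins), len(pooled))
```

Both inputs yield the same number of bin vectors (1 + 4 + 16 = 21), which is the property that lets a fixed-size classifier follow a variable-length spectrogram.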

Main Results:

  • Achieved an unweighted accuracy of 67.9% on the IEMOCAP dataset.
  • Achieved an unweighted accuracy of 77.6% on the EMODB dataset.
  • Demonstrated improved SER performance through the proposed model architecture.
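
Unweighted accuracy, the metric reported above, is commonly computed in SER work as the mean of per-class recalls, so that rare emotions count as much as frequent ones. The sketch below assumes that definition; the labels and predictions are made-up toy data, not results from the study.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls: every class contributes equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == prediction)
    return sum(correct[label] / total[label] for label in total) / len(total)


def weighted_accuracy(y_true, y_pred):
    """Plain accuracy: fraction of all samples predicted correctly."""
    hits = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy, imbalanced data: a model that always predicts "neutral".
y_true = ["neutral"] * 8 + ["angry"] * 2
y_pred = ["neutral"] * 10

print(weighted_accuracy(y_true, y_pred))    # 0.8
print(unweighted_accuracy(y_true, y_pred))  # 0.5
```

The gap between the two numbers shows why unweighted accuracy is preferred on class-imbalanced corpora: a majority-class predictor looks strong on plain accuracy but scores no better than chance per class.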

Conclusions:

  • The novel SER model effectively learns deep patterns from utterance-level spectrograms.
  • Integrating SPP, attention, and ArcFace enhances SER accuracy.
  • The proposed approach offers a promising direction for advancing SER technology.