Audio-Based Emotion Recognition Using Self-Supervised Learning on an Engineered Feature Space

Peranut Nimitsurachat¹, Peter Washington²

¹Institute for Computational and Mathematical Engineering (ICME), Stanford University, Stanford, CA 94305, USA.

AI (Basel, Switzerland)

May 8, 2024

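Summary

This summary is machine-generated.

Self-supervised learning (SSL) improves audio-based emotion recognition models, particularly when labeled data are scarce. Pre-training on acoustic features raises performance, and the gains are largest for emotions that are easier to classify.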

Keywords:

emotion classification, emotion recognition, self-supervised learning, transfer learning

Scientific fields:

Affective computing
Machine learning
Speech processing

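Background:

Recognizing emotion from audio is essential for interactive systems across many domains.
A key challenge is the limited availability of labeled training data for high-performing models.
Self-supervised learning (SSL) offers a solution by learning from properties of the data itself, without requiring extensive labels.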

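Purpose of the study:

To investigate how effective self-supervised pre-training is for audio-based emotion recognition.
To apply SSL to encoded acoustic features from the CMU-MOSEI dataset.
To evaluate the effect of SSL on model performance relative to a baseline deep learning model.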

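Main methods:

Applied self-supervised pre-training to encoded acoustic data (74 features) from the CMU-MOSEI dataset.
Pre-trained the model to predict masked time steps of the acoustic data.
Fine-tuned the pre-trained model on a small set of annotated data.
Evaluated performance using mean absolute error (MAE) and four-class accuracy, compared against a baseline.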
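The masked-prediction pre-training step can be illustrated with a minimal sketch. It shows only the objective: hide some time steps of a 74-feature sequence and score a reconstruction on the hidden steps alone. The function names, the 15% mask ratio, zero-filling of masked frames, and the squared-error loss are all assumptions made for illustration; the summary does not state the authors' choices, and no model is trained here.

```python
import random

NUM_FEATURES = 74  # acoustic features per time step, as stated in the summary


def mask_time_steps(sequence, mask_ratio=0.15, rng=None):
    """Zero out a random subset of time steps.

    Returns (masked_sequence, masked_indices). The mask ratio and
    zero-filling are illustrative assumptions, not the paper's settings.
    """
    rng = rng or random.Random()
    num_steps = len(sequence)
    num_masked = max(1, round(mask_ratio * num_steps))
    masked_indices = sorted(rng.sample(range(num_steps), num_masked))
    masked = [list(frame) for frame in sequence]  # copy so the input is untouched
    for t in masked_indices:
        masked[t] = [0.0] * len(sequence[t])
    return masked, masked_indices


def masked_reconstruction_loss(predicted, target, masked_indices):
    """Mean squared error computed only over the masked time steps."""
    total, count = 0.0, 0
    for t in masked_indices:
        for p, y in zip(predicted[t], target[t]):
            total += (p - y) ** 2
            count += 1
    return total / count


# Synthetic stand-in for one utterance: 20 time steps of 74 features each.
rng = random.Random(0)
sequence = [[rng.gauss(0.0, 1.0) for _ in range(NUM_FEATURES)] for _ in range(20)]
masked, masked_indices = mask_time_steps(sequence, mask_ratio=0.15, rng=rng)

# A real model would predict the hidden frames from their context. Here the
# zero-filled input stands in for an untrained prediction, and the original
# sequence stands in for a perfect one.
loss_untrained = masked_reconstruction_loss(masked, sequence, masked_indices)
loss_perfect = masked_reconstruction_loss(sequence, sequence, masked_indices)
print(len(masked_indices), loss_untrained > loss_perfect)
```

Restricting the loss to masked positions means the model gets no credit for copying frames it can already see, so it has to learn from temporal context. No labels are involved at this stage.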

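Main results:

SSL consistently improved model performance on every evaluation metric (MAE and accuracy).
The gains were largest when little annotated data was available for fine-tuning.
SSL produced notable improvements for emotions that are easier to classify, such as happiness, sadness, and anger.
SSL improved performance even when applied to embedded feature representations rather than raw audio data.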
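The two metrics reported above, MAE and four-class accuracy, can be sketched as follows. The summary does not define the four classes. This sketch assumes they are the integer intensity levels 0 to 3, obtained by rounding a continuous score; that binning, the helper names, and the sample numbers are all hypothetical.

```python
import math


def mean_absolute_error(predictions, targets):
    """Average absolute difference between predicted and true scores."""
    return sum(abs(p - y) for p, y in zip(predictions, targets)) / len(targets)


def to_class(score, num_classes=4):
    """Round a score to the nearest class index, clipped to [0, num_classes - 1].

    Assumption: classes are integer intensity levels 0..3.
    """
    return min(num_classes - 1, max(0, math.floor(score + 0.5)))


def four_class_accuracy(predictions, targets):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(to_class(p) == to_class(y) for p, y in zip(predictions, targets))
    return hits / len(targets)


# Made-up scores for a single emotion, for illustration only.
predictions = [0.2, 1.4, 2.6, 0.9]
targets = [0.0, 1.0, 2.0, 3.0]

mae = mean_absolute_error(predictions, targets)
accuracy = four_class_accuracy(predictions, targets)
print(f"MAE={mae:.3f}, four-class accuracy={accuracy:.2f}")
```

The two metrics are complementary: MAE penalizes how far a prediction misses, while the binned accuracy only records whether it lands in the right class.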

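Conclusions:

Self-supervised learning is highly beneficial for audio-based emotion recognition, particularly in low-data regimes.
By exploiting unlabeled data, SSL effectively strengthens affective computing models.
The study validates SSL on encoded acoustic features, offering a practical way to improve emotion recognition systems.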