Singing Voice Detection: A Survey.

Ramy Monir¹, Daniel Kostrzewa¹, Dariusz Mrozek¹

¹Department of Applied Informatics, Silesian University of Technology, 44-100 Gliwice, Poland.

Entropy (Basel, Switzerland)

January 21, 2022

Summary

This summary is machine-generated.

This study surveys singing voice detection techniques, comparing classical methods with deep learning approaches such as convolutional LSTM and GRU-RNN. The state-of-the-art algorithms it reviews report strong vocal detection results on public datasets such as Jamendo and RWC.

Keywords:

Mel-frequency cepstrum coefficients; datasets; deep learning models; hidden Markov models; music information retrieval; perceptual linear prediction; short-time Fourier transform; singing voice detection; support vector machines; vocal detection

Area of Science:

Music Information Retrieval
Signal Processing
Machine Learning

Background:

Singing voice detection (SVD) is essential for music analysis tasks.
Accurate SVD improves downstream applications like lyric alignment and melody extraction.
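
To illustrate how SVD output can feed a downstream task, the sketch below merges frame-level vocal/non-vocal decisions into time segments. It is a minimal illustration, not a method taken from the survey: the function name `frames_to_segments`, the 0.5 s hop size, and the example labels are all assumptions made for this example.

```python
def frames_to_segments(labels, hop_seconds):
    """Merge runs of vocal frames (label 1) into (start, end) times in seconds.

    `labels` is a hypothetical per-frame decision list (1 = vocal, 0 = non-vocal);
    `hop_seconds` is the assumed time step between consecutive frames.
    """
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i  # a vocal run begins at this frame
        elif label != 1 and start is not None:
            segments.append((start * hop_seconds, i * hop_seconds))
            start = None  # the run ended just before this frame
    if start is not None:
        # The last run extends to the end of the signal.
        segments.append((start * hop_seconds, len(labels) * hop_seconds))
    return segments


# Made-up frame decisions for illustration only.
frame_labels = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
segments = frames_to_segments(frame_labels, hop_seconds=0.5)
print(segments)
```

A lyric-alignment or melody-extraction stage could then restrict its processing to the returned segments instead of the whole recording.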

Purpose of the Study:

To provide a comprehensive survey of singing voice detection techniques.
To investigate both traditional and state-of-the-art SVD algorithms.
To compare the performance of different SVD methods.

Main Methods:

Review of classical and modern SVD algorithms.
Focus on deep learning models like Convolutional Long Short-Term Memory (ConvLSTM) and Gated Recurrent Unit Recurrent Neural Networks (GRU-RNN).
Comparative analysis using established datasets (Jamendo, RWC).
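
As a rough illustration of the recurrent models named above, the following sketch runs a single GRU cell over a sequence of per-frame feature vectors and maps each hidden state to a vocal probability. It rests on several assumptions: the weights are random and untrained, biases are omitted, the feature frames are random placeholders rather than real audio features, and the update rule follows one common convention (some libraries swap the roles of `z` and `1 - z`). The names `GRUCell`, `vocal_probabilities`, and `w_out` are hypothetical and do not come from the survey.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rng, rows, cols):
    """Small random weights; placeholders for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Bias-free GRU cell with untrained weights (illustration only)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        self.Wz = rand_matrix(rng, n_hidden, n_in)
        self.Uz = rand_matrix(rng, n_hidden, n_hidden)
        self.Wr = rand_matrix(rng, n_hidden, n_in)
        self.Ur = rand_matrix(rng, n_hidden, n_hidden)
        self.Wh = rand_matrix(rng, n_hidden, n_in)
        self.Uh = rand_matrix(rng, n_hidden, n_hidden)

    def step(self, x, h):
        # Update gate z and reset gate r, each in (0, 1).
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        # Candidate state computed from the input and the reset-gated previous state.
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [
            math.tanh(a + b)
            for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, gated_h))
        ]
        # Blend the previous state and the candidate using the update gate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def vocal_probabilities(frames, cell, w_out):
    """Return one probability per frame from a sigmoid readout of the hidden state."""
    h = [0.0] * cell.n_hidden
    probs = []
    for x in frames:
        h = cell.step(x, h)
        probs.append(sigmoid(sum(w * hi for w, hi in zip(w_out, h))))
    return probs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
cell = GRUCell(n_in=3, n_hidden=4, rng=rng)
w_out = [rng.uniform(-0.5, 0.5) for _ in range(4)]
# Placeholder feature frames; a real system would use audio features such as MFCCs.
frames = [[rng.random() for _ in range(3)] for _ in range(5)]
probs = vocal_probabilities(frames, cell, w_out)
print([round(p, 3) for p in probs])
```

Because the weights are untrained, the printed probabilities carry no meaning; the point is only the data flow from feature frames through the gates to a per-frame vocal score.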

Main Results:

Long-term recurrent convolutional networks achieve high performance on public datasets.
Deep learning approaches demonstrate significant advancements in SVD accuracy.
Dataset-specific performance variations are observed across different methods.
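
Comparisons of this kind are usually made on frame-level decisions. The sketch below computes accuracy, precision, recall, and F1 for one set of reference and predicted labels. This summary does not state which metrics the survey reports, so the metric choice is an assumption here, and the function name `frame_metrics` and the label sequences are made up for illustration.

```python
def frame_metrics(reference, predicted):
    """Frame-level accuracy, precision, recall, and F1 for binary labels (1 = vocal)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # vocal frames found
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false alarms
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # missed vocal frames
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # correct non-vocal frames
    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical annotations and detector output, not results from any dataset.
reference = [0, 1, 1, 1, 0, 0, 1, 0]
predicted = [0, 1, 0, 1, 1, 0, 1, 0]
metrics = frame_metrics(reference, predicted)
print(metrics)
```

Running the same function on each method and dataset pair gives directly comparable numbers, which is one way dataset-specific differences become visible.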

Conclusions:

SVD is a critical component in music information retrieval systems.
Advanced deep learning models offer superior performance for singing voice detection.
Further research can refine SVD for diverse musical contexts and datasets.