Audio2Gestures: Generating Diverse Gestures From Audio.

Jing Li, Di Kang, Wenjie Pei

    IEEE Transactions on Visualization and Computer Graphics | May 17, 2023
    Summary
    This summary is machine-generated.

    Generating realistic co-speech gestures from audio is challenging because the same speech can plausibly accompany many different motions. This study introduces a method that models this one-to-many audio-to-motion relationship, producing more varied and natural movements.

    Area of Science:

    • Computer Vision
    • Artificial Intelligence
    • Human-Computer Interaction

    Background:

    • Generating co-speech gestures from audio is complex due to the inherent one-to-many relationship between speech and motion.
    • Traditional models often predict average motions, leading to less diverse and engaging gestures.
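The averaging effect described above can be seen in a toy case. The sketch below is illustrative only: it assumes a single audio cue paired with two equally plausible one-dimensional gesture targets (hypothetical values, not data from the study) and shows that the single prediction with the lowest mean squared error is their average, which matches neither target.

```python
# Toy illustration with hypothetical data: one audio cue is paired with two
# equally plausible 1-D gesture targets (e.g. the hand moves left or right).
targets = [-1.0, 1.0]


def mse(prediction, targets):
    """Mean squared error of one deterministic prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Scan candidate predictions in [-1, 1] and keep the one with the lowest MSE.
candidates = [i / 10 for i in range(-10, 11)]
best = min(candidates, key=lambda p: mse(p, targets))

print(best)                 # the averaged, in-between gesture
print(mse(best, targets))   # error of the averaged prediction
print(mse(1.0, targets))    # error when committing to one real gesture
```

A deterministic regressor trained this way is pushed toward the in-between motion, which is why a model that can sample several outputs for the same audio is attractive.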

    Purpose of the Study:

    • To develop a novel approach for co-speech gesture generation that explicitly models the one-to-many audio-to-motion mapping.
    • To enhance the diversity and realism of generated gestures compared to existing methods.

    Main Methods:

    • Proposed a Variational Autoencoder (VAE) framework that splits cross-modal latent codes into shared (audio-correlated) and motion-specific (diverse) components.
    • Introduced specialized training losses, including relaxed motion loss, bicycle constraint, and diversity loss, to address training complexities.
    • Validated the approach on 3D and 2D motion datasets, incorporating structured losses (e.g., a short-time Fourier transform (STFT) loss) for improved motion evaluation.
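The latent code split listed above can be sketched roughly as follows. This is a minimal illustration and not the authors' architecture: the encoder, the linear decoder, its weights, the code sizes, and the `diversity` helper are all hypothetical stand-ins. It only shows the idea that a shared code is computed from audio while a motion-specific code is sampled, so one audio input can yield several different motions.

```python
import random

SPECIFIC_DIM = 2  # hypothetical size of the motion-specific code


def encode_audio(audio_frame):
    """Stand-in audio encoder: maps audio features to a 2-D shared code."""
    mean = sum(audio_frame) / len(audio_frame)
    spread = max(audio_frame) - min(audio_frame)
    return [mean, spread]


def sample_motion_specific(rng):
    """Draw the motion-specific code from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(SPECIFIC_DIM)]


def decode(shared, specific):
    """Stand-in decoder: fixed linear map from the concatenated code to a 2-D pose offset."""
    z = shared + specific  # list concatenation: [shared..., specific...]
    weights = [
        [0.5, -0.2, 1.0, 0.0],
        [0.1, 0.3, 0.0, 1.0],
    ]
    return [sum(w * v for w, v in zip(row, z)) for row in weights]


def diversity(motion_a, motion_b):
    """L1 distance between two motions; a diversity term would reward larger values."""
    return sum(abs(a - b) for a, b in zip(motion_a, motion_b))


rng = random.Random(0)          # fixed seed so the sketch is repeatable
audio = [0.2, 0.4, 0.9]         # hypothetical audio features for one frame
shared = encode_audio(audio)

# Same audio, two different motion-specific samples: two different motions.
motion_1 = decode(shared, sample_motion_specific(rng))
motion_2 = decode(shared, sample_motion_specific(rng))
print(motion_1, motion_2, diversity(motion_1, motion_2))
```

Holding the sampled code fixed instead makes the output depend only on the audio, which is the sense in which the shared component is audio-correlated.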

    Main Results:

    • The proposed method significantly outperforms state-of-the-art approaches in generating more realistic and diverse co-speech gestures, both quantitatively and qualitatively.
    • Demonstrated compatibility with various backbones like RNNs, Transformers, and Discrete Cosine Transform (DCT) modeling.
    • Showcased the ability to generate motion sequences with user-specified motion clips.
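For readers unfamiliar with DCT-based motion modeling, the sketch below shows the general idea under stated assumptions: a joint trajectory is represented by discrete cosine transform coefficients over time, and keeping only the low-frequency coefficients gives a smoother approximation. The trajectory values and the number of retained coefficients are hypothetical, and this is the textbook unnormalized DCT-II, not necessarily the exact formulation used in the study.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def idct2(c):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(c)
    return [
        c[0] / n
        + (2.0 / n)
        * sum(c[k] * math.cos(math.pi * (t + 0.5) * k / n) for k in range(1, n))
        for t in range(n)
    ]


# Hypothetical 1-D joint trajectory over eight frames.
trajectory = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

coeffs = dct2(trajectory)
reconstructed = idct2(coeffs)                   # full coefficients: exact up to rounding
smoothed = idct2(coeffs[:4] + [0.0] * 4)        # keep only the 4 lowest frequencies

print([round(v, 3) for v in reconstructed])
print([round(v, 3) for v in smoothed])
```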

    Conclusions:

    • The novel latent code splitting strategy effectively captures diverse gestural information independent of audio.
    • The developed training strategies and evaluation metrics lead to superior motion dynamics and nuanced details in generated gestures.
    • The method offers a flexible and powerful solution for realistic and controllable co-speech gesture generation.