Generating Visually Aligned Sound from Videos.

Peihao Chen, Yang Zhang, Mingkui Tan

IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society

August 4, 2020

Summary

This summary is machine-generated.

This study introduces REGNET, a novel framework for generating realistic sounds from videos. REGNET improves audio-visual alignment by using an audio forwarding regularizer, achieving a 68.12% success rate in fooling human evaluators.

Area of Science:

Computer Vision
Audio Signal Processing
Machine Learning

Background:

Generating realistic audio synchronized with video content is a complex challenge.
Existing models struggle with sounds originating outside the camera's view, leading to misaligned audio-visual mappings.
This can result in generated sounds that are temporally or contextually inconsistent with the visual input.

Purpose of the Study:

To develop a robust framework for generating temporally and content-wise aligned sound from natural videos.
To address the challenge of irrelevant sounds from outside the video frame interfering with sound generation.
To improve the accuracy and realism of synthesized audio in response to visual stimuli.

Main Methods:

Proposed a framework named REGNET for video-to-sound generation.
Extracted appearance and motion features from video frames to identify sound-emitting objects.
Introduced an innovative audio forwarding regularizer using real sound as input for stronger training supervision.
Removed the regularizer during testing so that sound is produced from visual features alone.
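
The train/test asymmetry described in the methods above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the averaging "bottleneck", the concatenation step, and the use of zeros at test time are hypothetical stand-ins, not the REGNET implementation. It only shows the idea that real sound contributes an extra input during training and is absent at test time.

```python
from typing import List, Optional


def audio_forwarding_regularizer(real_audio: List[float], bottleneck: int) -> List[float]:
    """Hypothetical stand-in: squeeze real audio into `bottleneck` averaged values."""
    chunk = max(1, len(real_audio) // bottleneck)
    return [
        sum(real_audio[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(bottleneck)
    ]


def generator_input(
    visual_features: List[float],
    real_audio: Optional[List[float]] = None,
    bottleneck: int = 2,
) -> List[float]:
    """Build the (toy) input for a sound generator.

    Training: pass `real_audio`, so the regularizer output is appended.
    Testing: omit `real_audio`; zeros take its place (an assumption of this
    sketch), so only the visual features carry information.
    """
    if real_audio is not None:
        reg = audio_forwarding_regularizer(real_audio, bottleneck)
    else:
        reg = [0.0] * bottleneck
    return visual_features + reg


# Training-time input: visual features plus compressed real sound.
train_in = generator_input([0.5, 0.1], real_audio=[1.0, 1.0, 3.0, 3.0])
# Test-time input: visual features only.
test_in = generator_input([0.5, 0.1])
print(train_in)
print(test_in)
```

The narrow bottleneck in this sketch reflects one plausible design choice: if the real sound could pass through unchanged, a generator could simply copy it and ignore the video.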

Main Results:

REGNET significantly enhances both temporal and content-wise alignment between generated sound and video.
Extensive evaluations demonstrated the effectiveness of the proposed audio forwarding regularizer.
In human evaluation, the generated sound fooled evaluators 68.12% of the time.
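
A success rate of this kind can be computed as the share of judgments in which an evaluator took a generated sound for a real one. The sketch below assumes that definition; the exact evaluation protocol is not described here, and the function name and sample judgments are hypothetical.

```python
from typing import List


def fooling_rate(judged_real: List[bool]) -> float:
    """Percentage of judgments in which a generated clip was labeled as real.

    `judged_real` holds one boolean per human judgment of a generated clip
    (hypothetical data layout, assumed for this sketch).
    """
    if not judged_real:
        raise ValueError("at least one judgment is required")
    return 100.0 * sum(judged_real) / len(judged_real)


# Made-up judgments for illustration only: 3 of 4 evaluators were fooled.
print(fooling_rate([True, True, False, True]))
```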

Conclusions:

REGNET effectively tackles the challenge of generating accurate and aligned sound from video.
The audio forwarding regularizer successfully prevents the model from learning incorrect visual-sound mappings.
The framework demonstrates a significant advancement in audio-visual synthesis, with potential applications in multimedia and virtual reality.