



Updated: Dec 13, 2025


Generating Visually Aligned Sound from Videos.

Peihao Chen, Yang Zhang, Mingkui Tan

    IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society
    |August 4, 2020
    Summary
    This summary is machine-generated.

    This study introduces REGNET, a framework for generating realistic sound from video. REGNET improves audio-visual alignment by training with an audio forwarding regularizer, and its generated sound fooled human evaluators 68.12% of the time.


    Area of Science:

    • Computer Vision
    • Audio Signal Processing
    • Machine Learning

    Background:

    • Generating realistic audio synchronized with video content is a complex challenge.
    • Existing models struggle with sounds originating outside the camera's view, leading to misaligned audio-visual mappings.
    • This can result in generated sounds that are temporally or contextually inconsistent with the visual input.

    Purpose of the Study:

    • To develop a robust framework for generating temporally and content-wise aligned sound from natural videos.
    • To address the challenge of irrelevant sounds from outside the video frame interfering with sound generation.
    • To improve the accuracy and realism of synthesized audio in response to visual stimuli.

    Main Methods:

    • Proposed a framework named REGNET for video-to-sound generation.
    • Extracted appearance and motion features from video frames to identify sound-emitting objects.
    • Introduced an audio forwarding regularizer that takes the real sound as input during training to provide stronger supervision.
    • Removed the regularizer at test time, so that sound is produced from visual features alone.
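
The train/test asymmetry described in the methods can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the averaging "encoders", the `bottleneck` parameter, and the choice to substitute zeros when the regularizer is removed are all assumptions made for illustration. It shows only the control flow: real sound is fed through a narrow side path during training, and that path is dropped at test time.

```python
def visual_encoder(frames):
    """Hypothetical stand-in: reduce each frame to one feature (its mean)."""
    return [sum(frame) / len(frame) for frame in frames]


def audio_forwarding_regularizer(real_audio, bottleneck=2):
    """Hypothetical stand-in: squeeze the real sound into a few numbers.

    Assumes len(real_audio) >= bottleneck. Trailing samples that do not
    fill a whole chunk are ignored.
    """
    chunk = len(real_audio) // bottleneck
    return [
        sum(real_audio[i * chunk:(i + 1) * chunk]) / chunk
        for i in range(bottleneck)
    ]


def generate_sound(frames, real_audio=None, bottleneck=2):
    """Toy generator: one output value per frame.

    Training: pass real_audio so the regularizer path contributes.
    Testing: leave real_audio as None; the side path is replaced by
    zeros (an assumption), so the output depends on the frames only.
    """
    visual = visual_encoder(frames)
    if real_audio is not None:
        side = audio_forwarding_regularizer(real_audio, bottleneck)
    else:
        side = [0.0] * bottleneck
    offset = sum(side) / len(side)
    return [v + offset for v in visual]


# Hypothetical inputs, not data from the study.
frames = [[0.0, 1.0], [1.0, 3.0]]
real_audio = [0.2, 0.4, 0.6, 0.8]

train_out = generate_sound(frames, real_audio=real_audio)
test_out = generate_sound(frames)  # visual features only
```

In this sketch `test_out` is a function of `frames` alone, which mirrors the stated goal of producing sound purely from visual features once the regularizer is removed.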

    Main Results:

    • REGNET significantly enhances both temporal and content-wise alignment between generated sound and video.
    • Extensive evaluations demonstrated the effectiveness of the proposed audio forwarding regularizer.
    • The generated sound fooled human evaluators 68.12% of the time.
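
A rate like the one reported above can be computed as the share of generated clips that evaluators judged to be real. The sketch below assumes that definition, which the summary does not spell out; the function name and the example votes are hypothetical and are not the study's data.

```python
def fooling_rate(judgements):
    """Percentage of generated clips that evaluators labelled "real"."""
    if not judgements:
        raise ValueError("need at least one judgement")
    fooled = sum(1 for label in judgements if label == "real")
    return 100.0 * fooled / len(judgements)


# Hypothetical votes on four generated clips.
votes = ["real", "fake", "real", "real"]
rate = fooling_rate(votes)
```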

    Conclusions:

    • REGNET effectively tackles the challenge of generating accurate and aligned sound from video.
    • The audio forwarding regularizer successfully prevents the model from learning incorrect visual-sound mappings.
    • The framework demonstrates a significant advancement in audio-visual synthesis, with potential applications in multimedia and virtual reality.