Related Experiment Video

Updated: Mar 8, 2026

Testing Sensory and Multisensory Function in Children with Autism Spectrum Disorder (09:13)

Published on: April 22, 2015

17.2K

Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

Israel D Gebru, Sileye Ba, Xiaofei Li

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |January 20, 2017
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces an audio-visual model for speaker diarization, accurately identifying who spoke when, even in complex multi-party interactions. The novel method effectively handles simultaneous speech and participant movement for improved dialogue analysis.

    Area of Science:

    • Computer Science
    • Signal Processing
    • Artificial Intelligence

    Background:

    • Speaker diarization aims to segment and label speech segments by speaker.
    • Existing methods struggle with multi-participant interactions, especially when participants are not facing the camera/microphone.
    • Challenges include tracking moving individuals and associating speech with the correct person.
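To make the task concrete, here is a minimal sketch of what a diarization output could look like. It is an illustration only, not taken from the paper: the function name `frames_to_turns`, the frame duration, and the one-label-per-frame representation are all assumptions, and this simplified form cannot represent the simultaneous speech that the study addresses.

```python
from typing import List, Optional, Tuple

# A speech turn: (speaker label, start time in seconds, end time in seconds).
Turn = Tuple[str, float, float]


def frames_to_turns(labels: List[Optional[str]], frame_s: float) -> List[Turn]:
    """Merge consecutive frames that share a speaker label into turns.

    labels[i] is the (hypothetical) speaker active in frame i, or None
    when nobody is speaking; frame_s is the assumed frame duration.
    """
    turns: List[Turn] = []
    start = 0  # index of the first frame in the current run
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of input or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None:  # silence produces no turn
                turns.append((labels[start], start * frame_s, i * frame_s))
            start = i
    return turns


if __name__ == "__main__":
    frame_labels = ["A", "A", None, "B", "B", "B"]
    print(frames_to_turns(frame_labels, frame_s=0.5))
```

With 0.5-second frames, the example input yields one turn for speaker "A" followed, after a silent frame, by one turn for speaker "B".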

    Purpose of the Study:

    • To propose a novel audio-visual spatiotemporal diarization model.
    • To address the limitations of current diarization systems in complex, dynamic multi-party conversations.
    • To improve the accuracy of speaker identification and speech turn segmentation in challenging scenarios.

    Main Methods:

    • Combines multiple-person visual tracking with multiple speech-source localization.
    • Employs a novel audio-visual fusion technique involving binaural spectral feature extraction, supervised audio-visual alignment, and semi-supervised clustering.
    • Utilizes a latent-variable temporal graphical model for inferring speaker identities and speech turns.
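The idea of inferring speaker identity over time with a temporal latent-variable model can be illustrated with a generic hidden Markov model decoded by the Viterbi algorithm. This is a deliberately simplified stand-in, not the model described in the paper: the per-frame scores, the `stay_prob` parameter, and the single-active-speaker assumption are all hypothetical choices made for the sketch.

```python
import math
from typing import List


def viterbi(scores: List[List[float]], stay_prob: float = 0.9) -> List[int]:
    """Return the most likely speaker index per frame for a toy HMM.

    scores[t][k] is an assumed positive likelihood that person k is the
    active speaker at frame t (for example, from audio-visual association).
    stay_prob is the assumed probability that the speaker does not change
    between consecutive frames.
    """
    n_states = len(scores[0])
    switch_prob = (1.0 - stay_prob) / (n_states - 1) if n_states > 1 else 1.0

    def log_trans(i: int, j: int) -> float:
        return math.log(stay_prob if i == j else switch_prob)

    # Initialise with a uniform prior over speakers.
    best = [math.log(1.0 / n_states) + math.log(s) for s in scores[0]]
    back: List[List[int]] = []

    for frame in scores[1:]:
        pointers: List[int] = []
        new_best: List[float] = []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this frame.
            prev = max(range(n_states), key=lambda i: best[i] + log_trans(i, j))
            pointers.append(prev)
            new_best.append(best[prev] + log_trans(prev, j) + math.log(frame[j]))
        back.append(pointers)
        best = new_best

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda k: best[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    frame_scores = [[0.9, 0.1], [0.8, 0.2], [0.45, 0.55],
                    [0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
    print(viterbi(frame_scores))
```

In the example, a frame-by-frame argmax would briefly switch to the second person at the third frame, whereas the temporal model keeps the first speaker until the evidence for a change is sustained. Viterbi decoding is exact for chain-structured models of this kind, which is one reason such models are attractive for turn inference.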

    Main Results:

    • The proposed method effectively handles simultaneous speech from multiple speakers.
    • Achieves accurate speech-to-person association even with moving participants and head turns.
    • Demonstrates efficient exact inference and robust performance compared to state-of-the-art algorithms.

    Conclusions:

    • The developed audio-visual diarization model offers a principled approach to complex dialogue scenarios.
    • The novel fusion and inference methods significantly advance speaker diarization capabilities.
    • Introduces a valuable new dataset for training and evaluating audio-visual diarization systems.