Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

Israel D Gebru, Sileye Ba, Xiaofei Li

IEEE Transactions on Pattern Analysis and Machine Intelligence

January 20, 2017

Summary

This summary is machine-generated.

This study introduces an audio-visual model for speaker diarization, accurately identifying who spoke when, even in complex multi-party interactions. The novel method effectively handles simultaneous speech and participant movement for improved dialogue analysis.

Area of Science:

Computer Science
Signal Processing
Artificial Intelligence

Background:

Speaker diarization aims to segment and label speech segments by speaker.
Existing methods struggle with multi-participant interactions, especially when participants are not facing the camera/microphone.
Challenges include tracking moving individuals and associating speech with the correct person.
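To make the "segment and label" goal concrete, here is a minimal sketch of the usual output format of a diarization system: a list of (start, end, speaker) segments built from per-frame speaker labels. The function name `frames_to_segments`, the fixed frame duration, and the use of `None` for silent frames are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Optional, Tuple

# (start_sec, end_sec, speaker_id); this tuple layout is an assumption for illustration.
Segment = Tuple[float, float, str]


def frames_to_segments(labels: List[Optional[str]], frame_sec: float) -> List[Segment]:
    """Merge runs of identical per-frame labels into timed speech segments.

    labels[i] is the speaker active in frame i, or None when nobody speaks.
    """
    segments: List[Segment] = []
    start = 0  # index of the first frame of the current run
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the input or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None:  # silent runs produce no segment
                segments.append((start * frame_sec, i * frame_sec, labels[start]))
            start = i
    return segments


if __name__ == "__main__":
    print(frames_to_segments(["A", "A", None, "B", "B", "B"], frame_sec=0.5))
```

This representation allows one speaker per frame at most; representing overlapping speech would need one label sequence per participant.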

Purpose of the Study:

To propose a novel audio-visual spatiotemporal diarization model.
To address the limitations of current diarization systems in complex, dynamic multi-party conversations.
To improve the accuracy of speaker identification and speech turn segmentation in challenging scenarios.

Main Methods:

Combines multiple-person visual tracking with multiple speech-source localization.
Employs a novel audio-visual fusion technique involving binaural spectral feature extraction, supervised audio-visual alignment, and semi-supervised clustering.
Utilizes a latent-variable temporal graphical model for inferring speaker identities and speech turns.
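The general idea of combining audio-visual association with temporal inference can be sketched as follows. This is not the paper's model: it swaps in a plain hidden Markov model decoded with the Viterbi algorithm, assumes exactly one active speaker per frame, and assumes that the localized sound direction and the tracked person positions have already been mapped into a shared one-dimensional coordinate. The names `association_log_likelihood`, `viterbi_speaker_path`, `sigma`, and `stay_prob` are hypothetical.

```python
import numpy as np


def association_log_likelihood(sound_pos: np.ndarray, person_pos: np.ndarray,
                               sigma: float = 0.1) -> np.ndarray:
    """Score how well each tracked person explains the localized sound.

    sound_pos:  shape (T,), sound-source position per frame (assumed coordinate).
    person_pos: shape (T, N), position of each of N tracked persons per frame.
    Returns an unnormalized Gaussian log-likelihood of shape (T, N).
    """
    diff = sound_pos[:, None] - person_pos
    return -0.5 * (diff / sigma) ** 2


def viterbi_speaker_path(log_emit: np.ndarray, stay_prob: float = 0.9) -> np.ndarray:
    """Most likely speaker sequence under a 'sticky' transition model (needs N >= 2)."""
    T, K = log_emit.shape
    switch_prob = (1.0 - stay_prob) / (K - 1)
    log_trans = np.full((K, K), np.log(switch_prob))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    score = log_emit[0] - np.log(K)        # uniform prior over speakers
    back = np.zeros((T, K), dtype=int)     # best predecessor of each state
    for t in range(1, T):
        cand = score[:, None] + log_trans  # cand[i, j]: arrive in j from i
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(K)] + log_emit[t]

    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(T - 1, 0, -1):          # backtrack from the last frame
        path[t - 1] = back[t, path[t]]
    return path


if __name__ == "__main__":
    persons = np.tile([0.2, 0.8], (7, 1))  # two static persons (toy data)
    sound = np.array([0.2, 0.2, 0.55, 0.2, 0.8, 0.8, 0.8])
    log_emit = association_log_likelihood(sound, persons)
    print("per-frame:", np.argmax(log_emit, axis=1))
    print("smoothed: ", viterbi_speaker_path(log_emit))
```

In the toy data, the ambiguous localization in the third frame would flip the per-frame decision to the second person, while the temporal model keeps the first person speaking. Handling simultaneous speakers, as the summarized method does, would require a richer state than a single speaker index.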

Main Results:

The proposed method effectively handles simultaneous speech from multiple speakers.
Achieves accurate speech-to-person association even with moving participants and head turns.
Demonstrates efficient exact inference and robust performance compared to state-of-the-art algorithms.

Conclusions:

The developed audio-visual diarization model offers a principled approach to complex dialogue scenarios.
The novel fusion and inference methods significantly advance speaker diarization capabilities.
Introduces a valuable new dataset for training and evaluating audio-visual diarization systems.