Hypothesis testing for evaluating a multimodal pattern recognition framework applied to speaker detection.

Patricia Besson¹, Murat Kunt

¹Signal Processing Institute (ITS), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland. patricia.besson@univmed.fr

Journal of Neuroengineering and Rehabilitation

March 29, 2008

Summary

This summary is machine-generated.

This study introduces a multimodal pattern recognition system for speaker detection using audio-visual synchrony. The proposed feature extraction method enhances classifier performance and system efficiency.

Area of Science:

Human-Computer Interaction
Pattern Recognition
Signal Processing

Background:

Speaker detection is crucial for applications like multimedia indexing and intelligent systems.
This research focuses on detecting the current speaker in audio-visual sequences.
The system requires minimal hardware: a single camera and microphone.

Purpose of the Study:

To propose a multimodal pattern recognition framework for speaker detection.
To evaluate the effectiveness of an information-theoretic feature extraction step.
To assess system performance using a hypothesis testing framework.

Main Methods:

Building a multimodal pattern recognition framework that integrates feature generation, extraction, and classification.
Estimating audio-visual synchrony to detect the current speaker.
Applying an information-theoretic approach to optimize audio feature extraction.
Using a hypothesis testing framework for classification and performance evaluation.
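
The synchrony idea in the methods above can be illustrated with a small sketch. It assumes mutual information as the synchrony measure and a simple histogram estimator; the summary does not specify either, so this is not the authors' implementation. The names `quantize`, `mutual_information`, and `pick_speaker`, the bin count, and the toy feature sequences are all hypothetical.

```python
import math
from collections import Counter


def quantize(values, bins):
    """Map each value to an integer bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(audio, video, bins=4):
    """Histogram estimate of I(A; V), in bits, for two equal-length sequences."""
    if len(audio) != len(video):
        raise ValueError("sequences must have the same length")
    a = quantize(audio, bins)
    v = quantize(video, bins)
    n = len(a)
    joint = Counter(zip(a, v))
    count_a = Counter(a)
    count_v = Counter(v)
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i, j) / (p(i) * p(j)) simplifies to c * n / (count_a[i] * count_v[j])
        mi += (c / n) * math.log2(c * n / (count_a[i] * count_v[j]))
    return mi


def pick_speaker(audio, candidates, bins=4):
    """Return the candidate whose visual feature shares the most information with the audio."""
    return max(candidates, key=lambda k: mutual_information(audio, candidates[k], bins))


# Toy data (assumed): one audio feature per frame, one visual feature per candidate region.
audio = [0, 1, 0, 1, 0, 1, 0, 1]
candidates = {
    "left": [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9],   # moves in step with the audio
    "right": [0, 0, 1, 1, 0, 0, 1, 1],                  # moves independently of it
}
speaker = pick_speaker(audio, candidates)
```

A histogram estimator is used here only because it is short; it is biased for small samples, and a real system would likely need a more careful density estimate.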

Main Results:

The hypothesis testing approach provides detection and false-alarm probabilities.
The proposed feature extraction step significantly improves classifier performance.
The same framework makes the efficiency of the entire pattern recognition process measurable, including the gain contributed by the feature extraction step.
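
As a rough illustration of the detection and false-alarm probabilities mentioned above, the sketch below estimates both empirically for a simple threshold test on a classifier score. The decision rule, the function names `detection_rates` and `roc_points`, and the scores and labels are assumptions made for this example, not values or procedures taken from the study.

```python
def detection_rates(scores, labels, threshold):
    """Return (P_D, P_FA) for the rule 'decide speaker when score > threshold'.

    labels[i] is truthy when sample i really is the speaker.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    p_d = sum(s > threshold for s in pos) / len(pos)    # detection probability
    p_fa = sum(s > threshold for s in neg) / len(neg)   # false-alarm probability
    return p_d, p_fa


def roc_points(scores, labels):
    """Sweep the threshold over the observed scores: (threshold, P_D, P_FA) tuples."""
    return [(t, *detection_rates(scores, labels, t)) for t in sorted(set(scores))]


# Toy synchrony scores (assumed): the first three samples are true speakers.
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
p_d, p_fa = detection_rates(scores, labels, threshold=0.5)
```

Comparing such (P_D, P_FA) pairs with and without a feature extraction step is one way the gain of that step could be quantified.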

Conclusions:

Hypothesis testing is a powerful tool for evaluating multimodal pattern recognition systems.
The feature extraction step demonstrably benefits speaker detection performance.
The framework is adaptable for other classification tasks involving co-occurring spatio-temporal signals.