Hypothesis testing for evaluating a multimodal pattern recognition framework applied to speaker detection.

Patricia Besson¹, Murat Kunt

  • ¹ Signal Processing Institute (ITS), Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland. Email: patricia.besson@univmed.fr

Journal of Neuroengineering and Rehabilitation | March 29, 2008
Summary
This summary is machine-generated.

This study introduces a multimodal pattern recognition system for speaker detection using audio-visual synchrony. The proposed feature extraction method enhances classifier performance and system efficiency.

Area of Science:

  • Human-Computer Interaction
  • Pattern Recognition
  • Signal Processing

Background:

  • Speaker detection is crucial for applications like multimedia indexing and intelligent systems.
  • This research focuses on detecting the current speaker in audio-visual sequences.
  • The system requires minimal hardware: a single camera and microphone.

Purpose of the Study:

  • To propose a multimodal pattern recognition framework for speaker detection.
  • To evaluate the effectiveness of an information-theoretic feature extraction step.
  • To assess system performance using a hypothesis testing framework.

Main Methods:

  • A multimodal pattern recognition framework integrating feature generation, extraction, and classification.
  • Audio-visual synchrony estimation for speaker identification.
  • An information-theoretic approach to optimized audio feature extraction.
  • A hypothesis testing framework for classification and performance evaluation.
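
To make the synchrony idea concrete, here is a minimal sketch that scores how strongly an audio feature and a visual feature co-vary. It assumes mutual information as the synchrony measure and a simple histogram estimator; the summary does not say which estimator the authors use, and the names `quantize`, `mutual_information`, `mouth_a`, and `mouth_b`, as well as the synthetic signals, are hypothetical.

```python
import math
import random
from collections import Counter


def quantize(values, bins):
    """Map real values to integer bin indices using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in bits."""
    qx, qy = quantize(x, bins), quantize(y, bins)
    n = len(qx)
    joint = Counter(zip(qx, qy))
    px, py = Counter(qx), Counter(qy)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (px[a] * py[b])
        mi += p_ab * math.log2(count * n / (px[a] * py[b]))
    return mi


# Synthetic stand-ins for real features (assumption: one scalar per frame).
rng = random.Random(0)
audio = [rng.gauss(0, 1) for _ in range(2000)]
mouth_a = [a + 0.5 * rng.gauss(0, 1) for a in audio]  # moves with the audio
mouth_b = [rng.gauss(0, 1) for _ in audio]            # unrelated motion

scores = {
    "A": mutual_information(audio, mouth_a),
    "B": mutual_information(audio, mouth_b),
}
speaker = max(scores, key=scores.get)  # candidate most in sync with the audio
```

In this toy setup the candidate whose visual feature shares more information with the audio is taken as the speaker; a real system would compute the features from the camera and microphone streams rather than from random numbers.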

Main Results:

  • The hypothesis testing approach provides detection and false-alarm probabilities.
  • The proposed feature extraction step significantly improves classifier performance.
  • The efficiency of the entire pattern recognition process is measurable and enhanced.
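
As an illustration of the two probabilities named above, the following sketch estimates them empirically for a simple threshold test. It assumes each test sequence yields a scalar score and a ground-truth label; the function name `detection_rates`, the threshold, and the example numbers are hypothetical and are not results from the study.

```python
def detection_rates(scores, is_speaker, threshold):
    """Return (P_D, P_FA) for the rule 'declare speaker if score > threshold'.

    P_D  = fraction of true speakers that are detected.
    P_FA = fraction of non-speakers that are wrongly declared speakers.
    """
    pairs = list(zip(scores, is_speaker))
    hits = sum(1 for s, t in pairs if t and s > threshold)
    false_alarms = sum(1 for s, t in pairs if not t and s > threshold)
    positives = sum(1 for _, t in pairs if t)
    negatives = len(pairs) - positives
    return hits / positives, false_alarms / negatives


# Made-up scores: the first four come from speakers, the last four do not.
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
example_labels = [True, True, True, True, False, False, False, False]

p_d, p_fa = detection_rates(example_scores, example_labels, threshold=0.5)
print(p_d, p_fa)  # 0.75 0.25
```

Sweeping the threshold and recording each (P_FA, P_D) pair traces out a curve that allows two classifiers, for example with and without the feature extraction step, to be compared at the same false-alarm rate.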

Conclusions:

  • Hypothesis testing is a powerful tool for evaluating multimodal pattern recognition systems.
  • The feature extraction step demonstrably benefits speaker detection performance.
  • The framework is adaptable for other classification tasks involving co-occurring spatio-temporal signals.