Unsupervised speaker recognition based on competition between self-organizing maps.

I Lapidot¹, H Guterman, A Cohen

¹Dept. of Software Eng., Negev Acad. Coll. of Eng., Beer-Sheva, Israel.

IEEE Transactions on Neural Networks

February 5, 2008

Summary

This summary is machine-generated.

This study introduces a novel speaker clustering method for unlabeled conversations using self-organizing maps (SOMs). The approach accurately identifies speakers and estimates participant numbers in audio data.

Area of Science:

Computational Linguistics
Speech Processing
Machine Learning

Background:

Unlabeled and unsegmented conversational audio presents challenges for speaker identification.
Existing methods often require prior knowledge of speaker identities or segmented audio.
Accurate speaker clustering is crucial for various audio analysis applications.

Purpose of the Study:

To develop and evaluate a method for clustering speakers in unlabeled, unsegmented conversations.
To enable speaker identification without a priori knowledge of participant identities.
To estimate the number of speakers in a conversation.

Main Methods:

Utilized self-organizing maps (SOMs) to model individual speakers.
Employed an iterative clustering algorithm in which data move between speaker models and the SOMs are readjusted after each move.
Constrained data to move between models only in groups, not as individual feature vectors, so that each SOM adapts to a speaker rather than to individual phonemes.
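
The steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 1-D SOM topology, the learning-rate schedule, the fixed group length, and the choice to retrain each SOM from scratch on every pass are all assumptions made for brevity. It shows only the general idea of one SOM per speaker, with whole groups of consecutive frames reassigned to the best-fitting model.

```python
import numpy as np

def train_som(data, n_units=4, epochs=5, lr=0.5, seed=0):
    """Fit a small 1-D SOM (a codebook of n_units vectors) to data.
    Topology and schedules are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    codebook = data[rng.choice(len(data), n_units, replace=True)].astype(float)
    idx = np.arange(n_units)
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs                # decays from 1 toward 0
        sigma = max(0.5, n_units / 2 * frac)       # neighbourhood width
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit = nearest codebook vector
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
            h = np.exp(-((idx - bmu) ** 2) / (2 * sigma ** 2))
            codebook += lr * frac * h[:, None] * (x - codebook)
    return codebook

def quantization_error(codebook, frames):
    """Squared distance from each frame to its nearest codebook vector."""
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1)

def cluster_speakers(features, n_speakers, group_len=20, n_iter=10, seed=0):
    """Assign groups of consecutive frames to competing SOMs.
    Returns one speaker label per group (hypothetical interface)."""
    rng = np.random.default_rng(seed)
    dim = features.shape[1]
    n_groups = len(features) // group_len
    groups = features[: n_groups * group_len].reshape(n_groups, group_len, dim)
    labels = rng.integers(n_speakers, size=n_groups)   # random initial models
    for _ in range(n_iter):
        soms = []
        for k in range(n_speakers):
            member = groups[labels == k].reshape(-1, dim)
            if len(member) == 0:                       # re-seed an empty model
                member = groups[rng.integers(n_groups)]
            soms.append(train_som(member, seed=seed + k))
        # Each group is scored as a whole, so frames cannot move individually
        cost = np.array([[quantization_error(s, g).mean() for s in soms]
                         for g in groups])
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):         # assignments are stable
            break
        labels = new_labels
    return labels
```

Calling `cluster_speakers(features, n_speakers=2)` on an array of per-frame feature vectors returns one label per group of `group_len` frames. Scoring each group by its mean quantization error is one simple way to express the group-wise constraint; the paper may use a different criterion.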

Main Results:

Achieved over 80% correct segmentation for two-speaker conversations (both high-quality and telephone-quality) and for three-speaker conversations.
Developed a validity criterion based on the iterative algorithm to estimate the number of speakers.
Correctly estimated the number of participants in 16 of 17 high-quality conversations with two or three speakers.

Conclusions:

The proposed SOM-based iterative clustering method effectively segments and identifies speakers in unlabeled conversations.
The developed validity criterion shows promise for automatically determining the number of speakers.
Performance is robust for high-quality audio, with potential for improvement in lower-quality telephone conversations.