Lipreading from color video.

G I Chiou (1), J N Hwang

  • (1) Dept. of Electr. Eng., Washington Univ., Seattle, WA.

IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society | January 1, 1997
Summary
This summary is machine-generated.

This study presents a lipreading system using only lip color video for word recognition. The visual-only system achieved 94% accuracy for ten isolated words.

Area of Science:

  • Computer Vision
  • Biomedical Engineering
  • Speech Recognition

Background:

  • Traditional speech recognition relies heavily on acoustic data, limiting its use in noisy environments or for individuals with speech impairments.
  • Visual speech information, particularly lip movements, offers a complementary modality for speech recognition.
  • Developing robust lipreading systems is crucial for advancing human-computer interaction and assistive technologies.

Purpose of the Study:

  • To design and implement a novel lipreading system capable of recognizing isolated words using solely visual information from lip movements.
  • To evaluate the system's performance and accuracy in a controlled setting without acoustic data.
  • To explore the efficacy of combining advanced image processing and machine learning techniques for visual speech recognition.

Main Methods:

  • Used "snakes" (active contour models) to extract geometric features of the lips.
  • Applied the Karhunen-Loève transform (KLT) to extract principal components in the color eigenspace of lip images.
  • Used hidden Markov models (HMMs) to recognize sequences of the combined visual features.
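
To make the KLT step above concrete, here is a minimal sketch of finding the dominant principal axis of a small set of color vectors and projecting onto it. It is not the authors' implementation: the toy RGB values, the function names (klt_first_component, project), and the use of power iteration are all assumptions made for illustration, and the summary does not state how many components or what input dimensionality the paper used.

```python
import math

def klt_first_component(samples, iters=200):
    """Return (mean, unit vector) for the dominant principal axis of samples."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all samples identical, no principal axis
        v = [x / norm for x in w]
    return mean, v

def project(sample, mean, axis):
    """Coordinate of a sample along the principal axis (its KLT coefficient)."""
    return sum((sample[j] - mean[j]) * axis[j] for j in range(len(axis)))

# Hypothetical toy data: RGB triples that vary mostly in the red channel.
PIXELS = [(200, 80, 80), (180, 82, 79), (220, 78, 81), (160, 81, 80), (240, 79, 80)]
MEAN, AXIS = klt_first_component(PIXELS)
COEFFS = [project(p, MEAN, AXIS) for p in PIXELS]
```

In this toy set the recovered axis points almost entirely along red, so each pixel is summarized by a single coefficient instead of three channel values. A real system would keep several components and would typically use a linear algebra library rather than hand-written power iteration.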

Main Results:

  • The lipreading system achieved a 94% recognition rate on ten isolated words.
  • The system successfully performed word recognition using only color video of human lips, without any acoustic data.
  • The combination of "snakes," KLT, and HMMs proved effective for visual feature extraction and sequence recognition.
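
One common way to do isolated-word recognition with HMMs is to keep one model per word and pick the word whose model assigns the observed feature sequence the highest likelihood. The sketch below shows that idea with the forward algorithm on discrete symbols. Everything specific in it is an assumption: the summary does not say whether the paper used discrete or continuous observations, how many states each model had, or how the models were trained, and the two toy models and their labels (word_a, word_b) are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete-observation HMM via the forward algorithm."""
    n = len(start)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)

def classify(obs, models):
    """Return the label of the model with the highest likelihood for obs."""
    return max(models, key=lambda w: forward_likelihood(obs, *models[w]))

# Hypothetical two-state left-to-right models over two symbols
# (think of 0 and 1 as quantized visual feature codes).
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    # word_a tends to emit symbol 0 first, then symbol 1.
    "word_a": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.2, 0.8]]),
    # word_b tends to emit symbol 1 first, then symbol 0.
    "word_b": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.8, 0.2]]),
}

print(classify([0, 0, 1, 1], MODELS))
print(classify([1, 1, 0, 0], MODELS))
```

For long sequences the raw probabilities underflow, so practical implementations work with scaled or log-domain values. The short toy sequences here do not need that.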

Conclusions:

  • Lipreading systems utilizing only visual information can achieve high accuracy in recognizing isolated words.
  • The developed system offers a promising alternative or supplement to acoustic-based speech recognition, particularly in challenging acoustic environments.
  • This research highlights the potential of advanced computer vision and machine learning techniques for robust visual speech recognition applications.