A compact representation of visual speech data using latent variables.

Ziheng Zhou¹, Xiaopeng Hong, Guoying Zhao

¹University of Oulu, Oulu.

IEEE Transactions on Pattern Analysis and Machine Intelligence

November 16, 2013

Summary

This summary is machine-generated.

This study introduces a new generative model for visual speech recognition. It compactly represents talking mouth movements by separating speaker and utterance variations for better decoding.

Area of Science:

Computer Vision
Machine Learning
Speech Processing

Background:

Visual speech recognition decodes talking mouth movements in high-dimensional visual spaces.
Existing methods face challenges in compactly representing complex visual speech data.

Purpose of the Study:

To propose a generative latent variable model for compact visual speech data representation.
To effectively model inter-speaker variations and utterance-specific dynamics.

Main Methods:

Developed a generative latent variable model.
Utilized latent variables to distinguish between visual appearance variations (inter-speaker) and utterance dynamics.
Incorporated structural information via priors on latent variables along a path graph.
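The methods above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the form of the prior or the likelihood, so the sketch assumes a Gaussian smoothness prior whose precision matrix is the Laplacian of a path graph (one node per video frame) and a linear-Gaussian observation model. All names (`path_graph_laplacian`, `sample_smooth_latents`, `generate_sequence`, `W_speaker`, `W_dynamics`, `smoothness`, `ridge`) are hypothetical.

```python
import numpy as np


def path_graph_laplacian(n):
    """Laplacian L = D - A of a path graph with n nodes (one node per frame)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0  # edge between frame t and frame t + 1
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def sample_smooth_latents(n_frames, latent_dim, smoothness=10.0, ridge=1e-2, rng=None):
    """Draw per-frame latents Z of shape (n_frames, latent_dim).

    Assumed prior: each latent dimension is a zero-mean Gaussian over frames
    with precision smoothness * L + ridge * I. The ridge term makes the
    precision invertible, since the Laplacian alone is singular.
    """
    rng = np.random.default_rng() if rng is None else rng
    precision = smoothness * path_graph_laplacian(n_frames) + ridge * np.eye(n_frames)
    cov = np.linalg.inv(precision)
    cov = 0.5 * (cov + cov.T)  # remove numerical asymmetry
    draws = rng.multivariate_normal(np.zeros(n_frames), cov, size=latent_dim)
    return draws.T


def generate_sequence(W_speaker, W_dynamics, h, Z, noise_std=0.01, rng=None):
    """Assumed linear-Gaussian decoder: x_t = W_speaker h + W_dynamics z_t + noise.

    h is one latent vector per video (inter-speaker appearance);
    Z holds one latent vector per frame (utterance dynamics).
    """
    rng = np.random.default_rng() if rng is None else rng
    X = W_speaker @ h + Z @ W_dynamics.T  # shape (n_frames, feature_dim)
    return X + noise_std * rng.standard_normal(X.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D, d_h, d_z = 20, 50, 3, 2  # frames, feature dim, latent sizes
    Z = sample_smooth_latents(T, d_z, rng=rng)
    h = rng.standard_normal(d_h)
    X = generate_sequence(rng.standard_normal((D, d_h)),
                          rng.standard_normal((D, d_z)), h, Z, rng=rng)
    print(X.shape)
```

Under this assumed prior, the quadratic form z^T L z equals the sum of squared differences between latents of neighbouring frames, so the prior penalises abrupt frame-to-frame changes. Keeping a single `h` per video and a separate `z_t` per frame is one simple way to keep speaker appearance apart from utterance dynamics.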

Main Results:

The proposed model offers a compact representation of visual speech data.
Demonstrated effective separation of inter-speaker variations and utterance-specific visual dynamics.
Integrated structural information through path-graph priors on the latent variables.

Conclusions:

The generative latent variable model provides an effective approach for compact visual speech data representation.
This method enhances visual speech recognition by modeling key variations distinctly.
The approach shows promise for improving the decoding of visual speech dynamics.