Deep Audio-Visual Speech Recognition.

Triantafyllos Afouras, Joon Son Chung, Andrew Senior

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |December 25, 2018
    Summary
    This summary is machine-generated.

    This study advances automatic lip reading of unconstrained sentences in natural video. New models and a large new dataset significantly improve performance, and the results show that lip reading complements speech recognition when the audio is noisy.

    Area of Science:

    • Computer Science
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Lip reading, or visual speech recognition, traditionally focused on limited vocabularies.
    • Existing methods struggle with natural, unconstrained language and real-world video conditions.

    Purpose of the Study:

    • To develop and evaluate advanced lip reading models for open-world, natural language sentence recognition.
    • To assess the complementary role of lip reading alongside audio in noisy environments.
    • To introduce a novel, large-scale dataset for audio-visual speech recognition research.

    Main Methods:

    • Comparison of two lip reading models built on the transformer self-attention architecture: one trained with a CTC loss, the other with a sequence-to-sequence loss.
    • Training and evaluation on a new, extensive dataset (LRS2-BBC) of natural sentences from broadcast television.
    • Investigating the synergy between visual speech recognition and noisy audio speech recognition.
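
To make the CTC side of this comparison concrete, here is a minimal sketch of greedy CTC decoding: pick the most likely symbol per video frame, merge consecutive repeats, then drop the blank symbol. The alphabet, the `-` blank marker, and both function names are assumptions made for illustration; this is not the study's model or decoder, which this summary does not describe in detail.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol, chosen for this sketch


def ctc_collapse(frame_labels, blank=BLANK):
    """Apply the CTC collapsing rule: merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


def ctc_greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Pick the highest-probability symbol in each frame, then collapse.

    frame_probs: one list of per-symbol probabilities per frame.
    alphabet:    symbols in the same order as the probability columns.
    """
    best_path = []
    for probs in frame_probs:
        best_index = max(range(len(probs)), key=probs.__getitem__)
        best_path.append(alphabet[best_index])
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # A blank between the two "l" runs keeps the double letter.
    print(ctc_collapse(list("hh-e-ll-lo--")))

    alphabet = [BLANK, "a", "b"]
    frame_probs = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.1, 0.8],
    ]
    print(ctc_greedy_decode(frame_probs, alphabet))
```

A sequence-to-sequence model, by contrast, emits output symbols one at a time conditioned on the symbols already produced, so it needs no collapsing step of this kind.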

    Main Results:

    • The trained models surpassed all previous work on a lip reading benchmark dataset by a significant margin.
    • Demonstrated the effectiveness of lip reading as a complementary modality to audio, especially under acoustic interference.
    • The LRS2-BBC dataset provides a valuable resource for advancing audio-visual speech recognition.
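
One common way to simulate acoustic interference when testing a recognizer is to add noise to clean speech at a chosen signal-to-noise ratio (SNR). The sketch below shows that arithmetic only. It is an assumption about how such a test could be set up, not a description of the study's protocol, and the function name `mix_at_snr` and the toy signals are hypothetical.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB.

    SNR(dB) = 10 * log10(speech_power / scaled_noise_power), so the noise
    is multiplied by sqrt(speech_power / (noise_power * 10 ** (snr_db / 10))).
    """
    if len(speech) != len(noise) or not speech:
        raise ValueError("speech and noise must be non-empty and equally long")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise signal has zero power")
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Toy signals standing in for real waveforms (hypothetical values).
    speech = [1.0, -1.0, 1.0, -1.0]
    noise = [0.5, 0.5, -0.5, -0.5]
    print(mix_at_snr(speech, noise, snr_db=0.0))
```

Lower SNR values mean louder noise relative to the speech, which is the regime where a visual stream would be expected to help most.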

    Conclusions:

    • Open-world lip reading is feasible with advanced deep learning architectures.
    • Lip reading offers substantial benefits in noisy conditions, enhancing overall speech recognition accuracy.
    • The LRS2-BBC dataset facilitates future research in robust audio-visual speech recognition.