Multimodal Sparse Transformer Network for Audio-Visual Speech Recognition.

Qiya Song, Bin Sun, Shutao Li

    IEEE Transactions on Neural Networks and Learning Systems | April 12, 2022
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces a multimodal sparse transformer network (MMST) to improve audio-visual speech recognition (AVSR) in noisy environments. The novel approach enhances visual features with motion information, significantly reducing word error rates for more robust speech recognition.

    Area of Science:

    • Artificial Intelligence
    • Computer Vision
    • Speech Processing

    Background:

    • Automatic speech recognition (ASR) systems face performance degradation in noisy conditions.
    • Audio-visual speech recognition (AVSR) uses visual cues to enhance ASR, especially in adverse environments.
    • Transformer architectures show promise in AVSR but struggle with irrelevant information and lack motion feature integration.

    Purpose of the Study:

    • To propose a novel multimodal sparse transformer network (MMST) for enhanced AVSR.
    • To address limitations of existing transformer models in handling long-term dependencies and irrelevant information.
    • To incorporate essential motion features into AVSR for improved spatio-temporal visual information utilization.

    Main Methods:

    • Developed a multimodal sparse transformer network (MMST) incorporating sparse self-attention.
    • Integrated motion features into the visual modality processing.
    • Utilized a cross-modal attention module for seamless information flow between motion and visual modalities.
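The two attention ideas listed above can be illustrated with a minimal sketch. The summary does not say how the MMST sparsifies attention or which modality supplies the queries, so two assumptions are made here: sparsity is approximated by keeping only the top-k scores per query (one common choice, not necessarily the authors'), and visual features act as queries over motion features. The function names, the toy feature vectors, and the parameter `k` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topk_sparse_attention(queries, keys, values, k):
    """Scaled dot-product attention that keeps only the k highest-scoring
    keys for each query (an assumed sparsification rule, not taken from
    the paper). All inputs are lists of equal-length float vectors."""
    dim = len(keys[0])
    outputs = []
    for query in queries:
        # Scaled dot-product score between this query and every key.
        scores = [
            sum(q * x for q, x in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        # Keep the indices of the k largest scores; the rest get zero weight.
        keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        weights = softmax([scores[i] for i in keep])
        # Weighted sum of the selected value vectors.
        out = [0.0] * len(values[0])
        for weight, idx in zip(weights, keep):
            for j, v in enumerate(values[idx]):
                out[j] += weight * v
        outputs.append(out)
    return outputs


# Cross-modal use (assumed direction): visual frames query motion features.
visual = [[1.0, 0.0], [0.0, 1.0]]               # hypothetical visual features
motion = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # hypothetical motion features
fused = topk_sparse_attention(visual, motion, motion, k=2)
```

Passing the same sequence as queries, keys, and values gives sparse self-attention; passing one modality as queries and the other as keys and values gives the cross-modal variant.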

    Main Results:

    • The MMST model demonstrated improved attention concentration on relevant global information.
    • Integration of motion features enhanced visual feature representation.
    • Experiments showed significant performance improvements over state-of-the-art methods on various datasets.
    • Reduced word error rate (WER) was achieved, indicating superior recognition accuracy.
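For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognized text and the reference transcript, divided by the number of reference words. The sketch below follows that standard definition; it is not the evaluation code used in the study, and the function name and example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance between the two strings,
    divided by the number of words in the reference."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A lower value means fewer word-level mistakes, which is why a reduced WER is read as better recognition accuracy.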

    Conclusions:

    • The proposed MMST effectively enhances audio-visual speech recognition performance, particularly in noisy conditions.
    • Incorporating motion features and sparse attention mechanisms is crucial for robust AVSR.
    • The MMST offers a promising direction for developing more reliable human-machine interfaces.