Random Orthogonal Additive Filters: A Solution to the Vanishing/Exploding Gradient of Deep Neural Networks.

IEEE Transactions on Neural Networks and Learning Systems

March 3, 2025

Summary

This summary is machine-generated.

A novel neural network (NN) architecture addresses the vanishing/exploding (V/E) gradient problem by ensuring approximate dynamical isometry. This approach enables training extremely deep networks and enhances recurrent neural networks (RNNs) for long-term dependencies.

Area of Science:

Artificial Intelligence
Machine Learning
Deep Learning

Background:

The vanishing/exploding (V/E) gradient problem has hindered neural network (NN) training since the early 1990s.
Existing solutions have not fully resolved this fundamental obstacle in deep learning.

Purpose of the Study:

To develop a novel NN architecture that overcomes the V/E gradient issue.
To achieve stable training for extremely deep neural networks and improve performance on tasks with long-term dependencies.

Main Methods:

The study proposes an architecture based on approximate dynamical isometry, where singular values of the input-output Jacobian (IOJ) are centered around 1.
Each layer passes the previous layer's activations through an orthogonal filter and forms a convex combination of that filtered signal with the nonlinear activations of the next layer.
Analytical bounds demonstrate the impossibility of gradient vanishing or exploding, even for infinite-depth networks.
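
The layer described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not give the exact update rule, so the form x_next = alpha * Q x + (1 - alpha) * tanh(W x + b) is assumed here, and the names `alpha`, `Q`, `W`, `b`, `random_orthogonal`, `layer`, and `layer_jacobian` are hypothetical. The script stacks a modest number of such layers and prints the extreme singular values of the resulting input-output Jacobian so they can be inspected.

```python
import numpy as np


def random_orthogonal(n, rng):
    """Draw a random n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flipping column signs keeps q orthogonal and removes the QR sign ambiguity.
    return q * np.sign(np.diag(r))


def layer(x, Q, W, b, alpha):
    """Assumed update: convex combination of an orthogonally filtered copy of x
    and the usual nonlinear activation. alpha in (0, 1) is a hypothetical
    mixing coefficient."""
    return alpha * (Q @ x) + (1.0 - alpha) * np.tanh(W @ x + b)


def layer_jacobian(x, Q, W, b, alpha):
    """Jacobian of `layer` with respect to x: alpha*Q + (1-alpha)*diag(tanh')*W."""
    d = 1.0 - np.tanh(W @ x + b) ** 2  # derivative of tanh at the pre-activation
    return alpha * Q + (1.0 - alpha) * (d[:, None] * W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, depth, alpha = 8, 50, 0.9  # illustrative sizes, not taken from the paper
    x = rng.standard_normal(n)
    J_total = np.eye(n)  # accumulates the input-output Jacobian
    for _ in range(depth):
        Q = random_orthogonal(n, rng)
        W = rng.standard_normal((n, n)) / np.sqrt(n)
        b = np.zeros(n)
        J_total = layer_jacobian(x, Q, W, b, alpha) @ J_total
        x = layer(x, Q, W, b, alpha)
    svals = np.linalg.svd(J_total, compute_uv=False)
    print("min / max singular value of the IOJ:", svals.min(), svals.max())
```

For a single layer of this assumed form, the singular values of the Jacobian lie within (1 - alpha) * ||W||_2 of alpha, because the orthogonal term contributes singular values exactly equal to alpha and the tanh derivative is at most 1. How this extends to the full depth-wise product is the subject of the paper's analytical bounds and is not reproduced here.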

Main Results:

Training of a 50,000-layer multilayer perceptron (MLP) and an Elman NN for 10,000 time steps was successfully demonstrated.
The proposed model is reported to outperform architectures such as LSTMs while being simpler.
A single-layer recurrent neural network (RNN) enhanced with this method achieved state-of-the-art results, reaching over 98% accuracy on the psMNIST task within ten epochs.

Conclusions:

The novel architecture effectively solves the V/E gradient problem, enabling unprecedented network depths.
This approach offers a simpler and more effective alternative to existing methods for handling long-term dependencies.
The findings pave the way for more efficient and powerful deep learning models.