Rethinking softmax in incremental learning.

Zheng Zhai (1), Jiali Zhang (2), Haiyu Wang (3)

  • (1) Department of Statistics, Faculty of Arts and Sciences, Beijing Normal University, Zhuhai, Guangdong, China.

Neural Networks : the Official Journal of the International Neural Network Society | September 1, 2025
Summary
This summary is machine-generated.

This study addresses catastrophic forgetting in incremental learning by introducing new distillation losses. The proposed methods improve predictive accuracy and reduce forgetting when they are added to existing distillation-based incremental learning frameworks.

Keywords:
Catastrophic forgetting, Continual learning, Distillation loss, Incremental learning, Life-long learning

Area of Science:

  • Machine Learning
  • Artificial Intelligence
  • Deep Learning

Background:

  • Catastrophic forgetting is a major hurdle in incremental learning, where models forget previously learned information when trained on new data.
  • The standard softmax cross-entropy distillation loss suffers from non-identifiability, hindering effective incremental learning.
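
The non-identifiability mentioned above can be illustrated with a small sketch. Softmax is unchanged when the same constant is added to every logit, so a softmax cross-entropy distillation loss cannot distinguish a student's logits from a shifted copy of them. The function names and the toy logit values below are assumptions made for illustration; they are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_ce(teacher_logits, student_logits):
    """Cross-entropy between teacher and student softmax outputs."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Toy logits, chosen only for illustration.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.8]
shifted_student = [z + 10.0 for z in student]  # same logits plus a constant

loss_original = distillation_ce(teacher, student)
loss_shifted = distillation_ce(teacher, shifted_student)

# The two losses agree up to floating-point error, so the loss alone
# does not pin down the student's logits.
print(abs(loss_original - loss_shifted) < 1e-9)
```

Because both students receive the same loss, minimizing it leaves the overall logit offset undetermined, which is the sense in which the loss is non-identifiable.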

Purpose of the Study:

  • To propose novel strategies to mitigate catastrophic forgetting in incremental learning.
  • To address the non-identifiability issue in softmax cross-entropy distillation loss.

Main Methods:

  • Introduced an imbalance-invariant distillation loss to counteract imbalanced weights during distillation.
  • Regularized the prediction and distillation losses with shift-sensitive alternatives so that the problem becomes identifiable.
  • Developed five novel approaches that integrate into existing frameworks such as LWF, LWM, and LUCIR.
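
This summary does not give the paper's loss formulas, so the following is only a sketch of the general idea of pairing a softmax distillation term with a shift-sensitive one. The temperature-softened cross-entropy is a standard knowledge-distillation form; the mean squared logit difference is a swapped-in, hypothetical stand-in for a "shift-sensitive alternative", and `weight`, `temperature`, and all function names are assumptions rather than the authors' choices.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_distillation(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between softened old-model and new-model outputs."""
    p = softmax(old_logits, temperature)
    q = softmax(new_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

def logit_mse(old_logits, new_logits):
    """Hypothetical shift-sensitive term: mean squared logit difference."""
    n = len(old_logits)
    return sum((a - b) ** 2 for a, b in zip(old_logits, new_logits)) / n

def regularized_distillation(old_logits, new_logits, weight=0.1, temperature=2.0):
    """Softmax distillation plus a weighted shift-sensitive penalty (illustrative)."""
    base = softmax_distillation(old_logits, new_logits, temperature)
    return base + weight * logit_mse(old_logits, new_logits)

# Toy logits: the "drifted" model equals the old model plus a constant shift.
old = [2.0, 0.5, -1.0]
drifted = [z + 5.0 for z in old]

plain_gap = softmax_distillation(old, drifted) - softmax_distillation(old, old)
regularized_gap = regularized_distillation(old, drifted) - regularized_distillation(old, old)

print(plain_gap)        # approximately zero: the softmax term ignores the shift
print(regularized_gap)  # positive: the added term penalizes the shift
```

In this toy setup the plain softmax term cannot tell the drifted logits from the originals, while the regularized loss can, which is the property a shift-sensitive term is meant to supply.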

Main Results:

  • Demonstrated consistent improvements in predictive accuracy across multiple incremental learning frameworks.
  • Achieved substantial reductions in forgetting rates in extensive numerical experiments.
  • On CIFAR-100, improved average accuracy by over 11% and reduced forgetting by over 16% for LWF, LWM, and LUCIR.
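
The summary does not state how average accuracy and forgetting were computed. One common convention in the continual-learning literature, assumed here, records the accuracy on each earlier task after every training stage, reports the mean final accuracy, and measures forgetting as the drop from each task's best earlier accuracy to its final accuracy. The accuracy values below are made-up toy numbers, not results from the paper.

```python
def final_average_accuracy(acc):
    """Mean accuracy over all tasks after the last training stage."""
    last = acc[-1]
    return sum(last) / len(last)

def average_forgetting(acc):
    """Mean drop from each task's best earlier accuracy to its final accuracy.

    acc[i][j] is the accuracy on task j after training through task i (j <= i).
    The last task is excluded because it has no earlier stage to forget from.
    """
    last_stage = len(acc) - 1
    drops = []
    for task in range(last_stage):
        best_before = max(acc[stage][task] for stage in range(task, last_stage))
        drops.append(best_before - acc[last_stage][task])
    return sum(drops) / len(drops)

# Made-up toy accuracies for three tasks learned in sequence.
toy_acc = [
    [0.90],
    [0.80, 0.85],
    [0.70, 0.75, 0.88],
]

avg_acc = final_average_accuracy(toy_acc)
avg_forget = average_forgetting(toy_acc)
print(round(avg_acc, 4), round(avg_forget, 4))
```

Under this convention, higher average accuracy and lower forgetting are both better, and the two can move independently, which is why results of this kind usually report both.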

Conclusions:

  • The proposed strategies effectively mitigate catastrophic forgetting in incremental learning.
  • The novel approaches enhance the performance of distillation-based incremental learning methods.
  • The research offers practical solutions for building more robust incremental learning systems.